thasso.xyz20231210T09:22:37+00:00https://thasso.xyzParsing Expressions by Recursive Descent in Haskell20231031T00:00:00+00:00https://thasso.xyz/2023/10/31/parsingexpressionsbyrecursivedescentinhaskell<p>Parsing numerical expressions by recursive descent is a joy in Haskell! It is incredibly concise and elegant, yet very simple.</p>
<p>What we want to parse are binary expressions like <code class="languageplaintext highlighterrouge">7 + 42 * 9</code>, <code class="languageplaintext highlighterrouge">2 * 3 / 4 * 5</code>, or <code class="languageplaintext highlighterrouge">8 * (10 - 6)</code>. As always, when parsing such expressions, we have to be aware of the associativity of the operators involved and of their different levels of precedence. In this case it’s simple: <code class="languageplaintext highlighterrouge">+</code>, <code class="languageplaintext highlighterrouge">-</code>, <code class="languageplaintext highlighterrouge">*</code>, and <code class="languageplaintext highlighterrouge">/</code> all associate to the left, and <code class="languageplaintext highlighterrouge">*</code> and <code class="languageplaintext highlighterrouge">/</code> have higher precedence than <code class="languageplaintext highlighterrouge">+</code> and <code class="languageplaintext highlighterrouge">-</code>.</p>
<p>This means that we want to turn the above expressions into the following ASTs.<sup id="fnref:1" role="docnoteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></p>
<p><code class="languageplaintext highlighterrouge">7 + 42 * 9</code> ⇒ <code class="languageplaintext highlighterrouge">7 + (42 * 9)</code>. <code class="languageplaintext highlighterrouge">*</code> has higher precedence than <code class="languageplaintext highlighterrouge">+</code>, so although they both associate to the left, <code class="languageplaintext highlighterrouge">*</code> binds tighter than <code class="languageplaintext highlighterrouge">+</code>.</p>
<p><img src="https://thasso.xyz/public/figures/ast1.png" alt="Figure 1: AST of 7 + 42 * 9" /></p>
<p><code class="languageplaintext highlighterrouge">2 * 3 / 4 * 5</code> ⇒ <code class="languageplaintext highlighterrouge">((2 * 3) / 4) * 5</code>. <code class="languageplaintext highlighterrouge">*</code> and <code class="languageplaintext highlighterrouge">/</code> have the same precedence and associate to the left.</p>
<p><img src="https://thasso.xyz/public/figures/ast2.png" alt="Figure 2: AST of 2 * 3 / 4 * 5" /></p>
<p><code class="languageplaintext highlighterrouge">8 * (10 - 6)</code>. Parentheses have the highest precedence.</p>
<p><img src="https://thasso.xyz/public/figures/ast3.png" alt="Figure 3: AST of 8 * (10 - 6)" /></p>
<p>The following grammar encodes the precedence and associativity constraints above. It is also not left-recursive, and can be used in a recursive descent parser.<sup id="fnref:2" role="docnoteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup></p>
<p><img src="https://thasso.xyz/public/figures/grammar.png" alt="Figure 4: A grammar for parsing expressions" /></p>
<p>Instead of using algorithms like Shunting Yard or precedence climbing, the precedence of the operators is encoded directly in the various production rules. This is the simplest approach to take, but it works well in the implementation. Nora Sandler presents this method, and explains how to get there <a href="https://norasandler.com/2017/12/15/WriteaCompiler3.html">here on her blog</a>. I recommend reading <a href="https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm">this article</a> by Theodore Norvell if you want to learn more about parsing expressions. It explains both the Shunting Yard algorithm and precedence climbing.</p>
<p>How would this grammar parse an expression like <code class="languageplaintext highlighterrouge">7 + 42 * 9</code>? It starts at <code class="languageplaintext highlighterrouge">7</code>, goes down the leftmost derivation of both <code class="languageplaintext highlighterrouge">expr</code> and <code class="languageplaintext highlighterrouge">term</code>, and then chooses the <code class="languageplaintext highlighterrouge">num</code> alternative in <code class="languageplaintext highlighterrouge">factor</code>. Next, <code class="languageplaintext highlighterrouge">+</code> is consumed by the optionally repeated part of <code class="languageplaintext highlighterrouge">expr</code>, and we go down another <code class="languageplaintext highlighterrouge">term</code>, with <code class="languageplaintext highlighterrouge">42 * 9</code> as the rest of the input. The recursion mechanism at work here defers the partial tree consisting of <code class="languageplaintext highlighterrouge">(+ 7 <term>)</code> that we have parsed so far. Starting at <code class="languageplaintext highlighterrouge">42</code>, <code class="languageplaintext highlighterrouge">term</code> now goes down the leftmost <code class="languageplaintext highlighterrouge">factor</code> again. This <code class="languageplaintext highlighterrouge">factor</code> becomes another <code class="languageplaintext highlighterrouge">num</code>, consuming <code class="languageplaintext highlighterrouge">42</code> from the input. Now <code class="languageplaintext highlighterrouge">*</code> is consumed by the optionally repeated part of <code class="languageplaintext highlighterrouge">term</code>, and then <code class="languageplaintext highlighterrouge">factor</code> consumes the last numeric literal <code class="languageplaintext highlighterrouge">9</code>. 
In total, the second <code class="languageplaintext highlighterrouge">term</code> in the <code class="languageplaintext highlighterrouge">expr</code> production rule produces the tree <code class="languageplaintext highlighterrouge">(* 42 9)</code>. Now that the end of the input has been reached, this tree is used to complete the first partial tree. This way we get <code class="languageplaintext highlighterrouge">(+ 7 (* 42 9))</code> as the result.</p>
<h1 id="implementation">Implementation</h1>
<p>We’ll use the <a href="https://hackage.haskell.org/package/megaparsec">Megaparsec</a> library of parser combinators for our implementation. The <a href="https://markkarpov.com/tutorial/megaparsec.html">Megaparsec tutorial</a> is quite thorough, and I recommend you give it a read if you want to use Megaparsec.</p>
<p>First off, let’s define a representation of the ASTs we wish to create:</p>
<div class="languagehaskell highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Expr.hs</span>
<span class="kr">data</span> <span class="kt">Expr</span>
  <span class="o">=</span> <span class="kt">Add</span> <span class="kt">Expr</span> <span class="kt">Expr</span> <span class="c1">-- +</span>
  <span class="o">|</span> <span class="kt">Sub</span> <span class="kt">Expr</span> <span class="kt">Expr</span> <span class="c1">-- -</span>
  <span class="o">|</span> <span class="kt">Mul</span> <span class="kt">Expr</span> <span class="kt">Expr</span> <span class="c1">-- *</span>
  <span class="o">|</span> <span class="kt">Div</span> <span class="kt">Expr</span> <span class="kt">Expr</span> <span class="c1">-- /</span>
  <span class="o">|</span> <span class="kt">Num</span> <span class="kt">Int</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">,</span> <span class="kt">Eq</span><span class="p">)</span>
</code></pre></div></div>
<p>The first <code class="languageplaintext highlighterrouge">Expr</code> represents the left-hand side of the binary expressions, and
the second <code class="languageplaintext highlighterrouge">Expr</code> represents the right-hand side.</p>
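<p>To see why the shape of the tree matters, here is a small sketch of an evaluator. It is not part of the parser: <code class="languageplaintext highlighterrouge">eval</code> and the example values are names I made up for illustration, and using integer division for <code class="languageplaintext highlighterrouge">Div</code> is an assumption on my part.</p>
<pre><code class="languagehaskell">-- The AST from above, repeated so that this snippet stands on its own.
data Expr
  = Add Expr Expr
  | Sub Expr Expr
  | Mul Expr Expr
  | Div Expr Expr
  | Num Int
  deriving (Show, Eq)

-- Hypothetical evaluator; Div is assumed to be integer division.
eval :: Expr -> Int
eval (Add l r) = eval l + eval r
eval (Sub l r) = eval l - eval r
eval (Mul l r) = eval l * eval r
eval (Div l r) = eval l `div` eval r
eval (Num n) = n

-- ((2 * 3) / 4) * 5, the left-associative tree we want.
leftGrouped :: Expr
leftGrouped = Mul (Div (Mul (Num 2) (Num 3)) (Num 4)) (Num 5)

-- 2 * (3 / (4 * 5)), what a right-associative parse would produce.
rightGrouped :: Expr
rightGrouped = Mul (Num 2) (Div (Num 3) (Mul (Num 4) (Num 5)))
</code></pre>
<p>With integer division, <code class="languageplaintext highlighterrouge">eval leftGrouped</code> gives <code class="languageplaintext highlighterrouge">5</code> while <code class="languageplaintext highlighterrouge">eval rightGrouped</code> gives <code class="languageplaintext highlighterrouge">0</code>, so getting the associativity wrong changes the result.</p>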
<p>Next, we’ll need to define some helpers to start parsing. Here we mostly use the combinators found in <a href="https://hackage.haskell.org/package/base4.16.3.0/docs/ControlApplicative.html"><code class="languageplaintext highlighterrouge">Control.Applicative</code></a> and in Megaparsec’s <a href="https://hackage.haskell.org/package/megaparsec9.6.0/docs/TextMegaparsecCharLexer.html">Lexer module</a>.</p>
<div class="languagehaskell highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Expr.hs</span>
<span class="kr">import</span> <span class="nn">Data.Void</span>
<span class="kr">import</span> <span class="nn">Control.Applicative</span> <span class="k">hiding</span> <span class="p">(</span><span class="nf">many</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Text.Megaparsec</span>
<span class="kr">import</span> <span class="nn">Text.Megaparsec.Char</span>
<span class="kr">import</span> <span class="nn">Text.Megaparsec.Char.Lexer</span> <span class="k">as</span> <span class="n">L</span>
<span class="kr">data</span> <span class="kt">Expr</span> <span class="o">=</span> <span class="c1">-- [...]</span>
<span class="kr">type</span> <span class="kt">Parser</span> <span class="o">=</span> <span class="kt">Parsec</span> <span class="kt">Void</span> <span class="kt">String</span>
<span class="n">spaceConsumer</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="nb">()</span>
<span class="n">spaceConsumer</span> <span class="o">=</span> <span class="kt">L</span><span class="o">.</span><span class="n">space</span> <span class="n">space1</span> <span class="n">empty</span> <span class="n">empty</span>
<span class="n">pSymbol</span> <span class="o">::</span> <span class="kt">String</span> <span class="o">-></span> <span class="kt">Parser</span> <span class="kt">String</span>
<span class="n">pSymbol</span> <span class="o">=</span> <span class="kt">L</span><span class="o">.</span><span class="n">symbol</span> <span class="n">spaceConsumer</span>
<span class="n">pLexeme</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Parser</span> <span class="n">a</span>
<span class="n">pLexeme</span> <span class="o">=</span> <span class="kt">L</span><span class="o">.</span><span class="n">lexeme</span> <span class="n">spaceConsumer</span>
<span class="n">pNum</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pNum</span> <span class="o">=</span> <span class="kt">Num</span> <span class="o"><$></span> <span class="n">pLexeme</span> <span class="kt">L</span><span class="o">.</span><span class="n">decimal</span>
</code></pre></div></div>
<p><code class="languageplaintext highlighterrouge">pSymbol</code> and <code class="languageplaintext highlighterrouge">pLexeme</code> consume all white space <em>after</em> the token they parse. They don’t consume initial white space, so be careful about that. Now we can already parse numbers.</p>
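<p>If your input may start with white space, one option is to skip it once at the very beginning. The following is only a sketch: <code class="languageplaintext highlighterrouge">pWhole</code> is a name I made up for illustration, and <code class="languageplaintext highlighterrouge">pInt</code> is a simplified stand-in for <code class="languageplaintext highlighterrouge">pNum</code> that returns a plain <code class="languageplaintext highlighterrouge">Int</code>.</p>
<pre><code class="languagehaskell">import Control.Applicative (empty)
import Data.Void (Void)
import Text.Megaparsec (Parsec, eof, parseMaybe)
import Text.Megaparsec.Char (space1)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Skips white space only; no line or block comments are configured.
spaceConsumer :: Parser ()
spaceConsumer = L.space space1 empty empty

-- Simplified stand-in for pNum: a decimal number plus trailing white space.
pInt :: Parser Int
pInt = L.lexeme spaceConsumer L.decimal

-- Hypothetical helper: skip leading white space once, run the parser,
-- and require that the whole input has been consumed.
pWhole :: Parser a -> Parser a
pWhole p = spaceConsumer *> p <* eof
</code></pre>
<p>In GHCi, <code class="languageplaintext highlighterrouge">parseMaybe pInt " 7"</code> should give <code class="languageplaintext highlighterrouge">Nothing</code>, while <code class="languageplaintext highlighterrouge">parseMaybe (pWhole pInt) " 7 "</code> should give <code class="languageplaintext highlighterrouge">Just 7</code>.</p>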
<pre><code class="languageplain">λ :l Expr
[1 of 2] Compiling Main ( Expr.hs, interpreted )
Ok, one module loaded.
λ parseTest (pNum <* eof) "7"
Num 7
λ parseTest (pNum <* eof) "43587"
Num 43587
λ parseTest (pNum <* eof) "blah"
1:1:
  |
1 | blah
  | ^
unexpected 'b'
expecting integer
λ parseTest (pNum <* eof) "92 * 4"
1:4:
  |
1 | 92 * 4
  |    ^
unexpected '*'
expecting end of input
</code></pre>
<p>As you can see, different numbers are all parsed correctly and invalid inputs are rejected with nice error messages generated by Megaparsec.</p>
<p>Let’s now start implementing the parser. We’ll build it up from the bottom, starting with <code class="languageplaintext highlighterrouge">factor</code>.</p>
<div class="languagehaskell highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Expr.hs</span>
<span class="c1">-- [...]</span>
<span class="n">inParens</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Parser</span> <span class="n">a</span>
<span class="n">inParens</span> <span class="o">=</span> <span class="n">between</span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"("</span><span class="p">)</span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">")"</span><span class="p">)</span>
<span class="n">pFactor</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pFactor</span> <span class="o">=</span> <span class="n">inParens</span> <span class="n">pExpr</span> <span class="o"><|></span> <span class="n">pNum</span>
<span class="n">pExpr</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pExpr</span> <span class="o">=</span> <span class="n">undefined</span>
</code></pre></div></div>
<p>How do we define <code class="languageplaintext highlighterrouge">pExpr</code>? It should parse a single term, and then go on to parse zero or more plus or minus characters, each followed by another term. <code class="languageplaintext highlighterrouge">term</code> has the same shape as <code class="languageplaintext highlighterrouge">expr</code>, so once we know how to implement <code class="languageplaintext highlighterrouge">expr</code>, we can also implement <code class="languageplaintext highlighterrouge">term</code>. Parsing the first term is simple:</p>
<div class="languagehaskell highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Expr.hs</span>
<span class="c1">-- [...]</span>
<span class="n">pTerm</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pTerm</span> <span class="o">=</span> <span class="c1">-- [...]</span>
<span class="n">pExpr</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pExpr</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">lhs</span> <span class="o"><-</span> <span class="n">pTerm</span>
  <span class="c1">-- ...</span>
</code></pre></div></div>
<p>A parser that parses a <code class="languageplaintext highlighterrouge">+</code> or a <code class="languageplaintext highlighterrouge">-</code> and then parses another term might look like this: <code class="languageplaintext highlighterrouge">((pSymbol "+" $> Add) <|> (pSymbol "-" $> Sub)) <*> pTerm</code>. It discards the symbol it parsed and instead returns the value constructor of the expression that belongs to that symbol. Then it applies that value constructor to the expression parsed by the <code class="languageplaintext highlighterrouge">pTerm</code> on the right. But there is a problem here! The term that the value constructor is applied to first is the right-hand side of the binary expression, whereas the first parameter of the value constructor is defined to be the left-hand side. We need to <code class="languageplaintext highlighterrouge">flip</code> the parameters of the value constructor.</p>
<div class="languagehaskell highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Expr.hs</span>
<span class="kr">import</span> <span class="nn">Data.Functor</span> <span class="p">((</span><span class="o">$></span><span class="p">))</span>
<span class="c1">-- [...]</span>
<span class="n">pExpr</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pExpr</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="c1">-- lhs :: Expr</span>
  <span class="n">lhs</span> <span class="o"><-</span> <span class="n">pTerm</span>
  <span class="c1">-- rhs :: Expr -> Expr</span>
  <span class="n">rhs</span> <span class="o"><-</span> <span class="n">flip</span> <span class="o"><$></span> <span class="n">pOperator</span> <span class="o"><*></span> <span class="n">pTerm</span>
  <span class="n">pure</span> <span class="o">$</span> <span class="n">rhs</span> <span class="n">lhs</span>
  <span class="kr">where</span>
    <span class="n">pOperator</span> <span class="o">=</span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"+"</span> <span class="o">$></span> <span class="kt">Add</span><span class="p">)</span> <span class="o"><|></span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"-"</span> <span class="o">$></span> <span class="kt">Sub</span><span class="p">)</span>
</code></pre></div></div>
<p>Let’s try it out again.</p>
<pre><code class="languageplain">λ :l Expr
[1 of 2] Compiling Main ( Expr.hs, interpreted )
Ok, one module loaded.
λ parseTest (pExpr <* eof) "92 * 4"
1:7:
  |
1 | 92 * 4
  |       ^
unexpected end of input
expecting '+', '-', or digit
</code></pre>
<p>It doesn’t work yet, because we’re missing the <em>zero or more repetitions</em> part. For this, <a href="https://hackage.haskell.org/package/base4.19.0.0/docs/ControlApplicative.html#v:many"><code class="languageplaintext highlighterrouge">many</code></a> can be used, which will run the given parser zero or more times and return a list of all results. In our case, it returns a list of <code class="languageplaintext highlighterrouge">Expr -> Expr</code> functions. A left fold can be used to apply the functions in this list one after another, starting with <code class="languageplaintext highlighterrouge">lhs</code>. This will build the desired left-associative tree of expressions.</p>
<div class="languagehaskell highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Expr.hs</span>
<span class="c1">-- [...]</span>
<span class="n">pTerm</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pTerm</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">lhs</span> <span class="o"><-</span> <span class="n">pFactor</span>
  <span class="n">rhs</span> <span class="o"><-</span> <span class="n">many</span> <span class="o">$</span> <span class="n">flip</span> <span class="o"><$></span> <span class="n">pOperator</span> <span class="o"><*></span> <span class="n">pFactor</span>
  <span class="n">pure</span> <span class="o">$</span> <span class="n">foldl</span> <span class="p">(</span><span class="nf">\</span><span class="n">expr</span> <span class="n">f</span> <span class="o">-></span> <span class="n">f</span> <span class="n">expr</span><span class="p">)</span> <span class="n">lhs</span> <span class="n">rhs</span>
  <span class="kr">where</span>
    <span class="n">pOperator</span> <span class="o">=</span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"*"</span> <span class="o">$></span> <span class="kt">Mul</span><span class="p">)</span> <span class="o"><|></span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"/"</span> <span class="o">$></span> <span class="kt">Div</span><span class="p">)</span>
<span class="n">pExpr</span> <span class="o">::</span> <span class="kt">Parser</span> <span class="kt">Expr</span>
<span class="n">pExpr</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">lhs</span> <span class="o"><-</span> <span class="n">pTerm</span>
  <span class="n">rhs</span> <span class="o"><-</span> <span class="n">many</span> <span class="o">$</span> <span class="n">flip</span> <span class="o"><$></span> <span class="n">pOperator</span> <span class="o"><*></span> <span class="n">pTerm</span>
  <span class="n">pure</span> <span class="o">$</span> <span class="n">foldl</span> <span class="p">(</span><span class="nf">\</span><span class="n">expr</span> <span class="n">f</span> <span class="o">-></span> <span class="n">f</span> <span class="n">expr</span><span class="p">)</span> <span class="n">lhs</span> <span class="n">rhs</span>
  <span class="kr">where</span>
    <span class="n">pOperator</span> <span class="o">=</span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"+"</span> <span class="o">$></span> <span class="kt">Add</span><span class="p">)</span> <span class="o"><|></span> <span class="p">(</span><span class="n">pSymbol</span> <span class="s">"-"</span> <span class="o">$></span> <span class="kt">Sub</span><span class="p">)</span>
</code></pre></div></div>
<p>Now it works! I formatted the GHCi output a bit so it’s easy to recognize that the trees in the output match those from the beginning of this post.</p>
<pre><code class="languageplain">λ :l Expr
[1 of 2] Compiling Main ( Expr.hs, interpreted )
Ok, one module loaded.
λ parseTest (pExpr <* eof) "92 * 4"
Mul (Num 92) (Num 4)
λ parseTest (pExpr <* eof) "7 + 42 * 9"
Add
  (Num 7)
  (Mul
    (Num 42)
    (Num 9))
λ parseTest (pExpr <* eof) "2 * 3 / 4 * 5"
Mul
  (Div
    (Mul
      (Num 2)
      (Num 3))
    (Num 4))
  (Num 5)
λ parseTest (pExpr <* eof) "8 * (10 - 6)"
Mul
  (Num 8)
  (Sub
    (Num 10)
    (Num 6))
</code></pre>
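<p>The <code class="languageplaintext highlighterrouge">flip</code> and <code class="languageplaintext highlighterrouge">foldl</code> steps can also be tried in isolation, without any parsing involved. This is just a sketch to illustrate the idea: <code class="languageplaintext highlighterrouge">applyAll</code> and <code class="languageplaintext highlighterrouge">example</code> are names I made up, and the list stands in for what <code class="languageplaintext highlighterrouge">many</code> would return for the input <code class="languageplaintext highlighterrouge">* 3 / 4 * 5</code>.</p>
<pre><code class="languagehaskell">-- The AST from above, repeated so that this snippet stands on its own.
data Expr
  = Add Expr Expr
  | Sub Expr Expr
  | Mul Expr Expr
  | Div Expr Expr
  | Num Int
  deriving (Show, Eq)

-- Apply a list of functions from left to right, starting with lhs.
-- This is the same fold that pTerm and pExpr use.
applyAll :: Expr -> [Expr -> Expr]  -> Expr
applyAll = foldl (\expr f -> f expr)

-- `flip Mul (Num 3)` waits for the left-hand side: \lhs -> Mul lhs (Num 3).
example :: Expr
example = applyAll (Num 2) [flip Mul (Num 3), flip Div (Num 4), flip Mul (Num 5)]
</code></pre>
<p>Evaluating <code class="languageplaintext highlighterrouge">example</code> in GHCi gives <code class="languageplaintext highlighterrouge">Mul (Div (Mul (Num 2) (Num 3)) (Num 4)) (Num 5)</code>: each function wraps the tree built so far as its left-hand side, which is exactly what makes the result left-associative.</p>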
<h1 id="conclusion">Conclusion</h1>
<p><code class="languageplaintext highlighterrouge">pTerm</code> and <code class="languageplaintext highlighterrouge">pExpr</code> are very similar and can easily be abstracted into a function that parses any left-associative binary expression. Then, the production rule for any level of precedence can be implemented in a single line. Unary operators can also be added by extending <code class="languageplaintext highlighterrouge">pFactor</code>.</p>
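<p>As a rough sketch of what such an abstraction might look like, here is one possible version. It is not necessarily the same as the function in the linked code: <code class="languageplaintext highlighterrouge">pLeftAssoc</code> is a name I chose for illustration, and the grammar rules in the comments are my reading of the grammar described above.</p>
<pre><code class="languagehaskell">import Control.Applicative (empty, (<|>))
import Data.Functor (($>))
import Data.Void (Void)
import Text.Megaparsec (Parsec, between, many, parseMaybe)
import Text.Megaparsec.Char (space1)
import qualified Text.Megaparsec.Char.Lexer as L

data Expr
  = Add Expr Expr
  | Sub Expr Expr
  | Mul Expr Expr
  | Div Expr Expr
  | Num Int
  deriving (Show, Eq)

type Parser = Parsec Void String

spaceConsumer :: Parser ()
spaceConsumer = L.space space1 empty empty

pSymbol :: String -> Parser String
pSymbol = L.symbol spaceConsumer

pNum :: Parser Expr
pNum = Num <$> L.lexeme spaceConsumer L.decimal

-- Hypothetical helper: one operand, followed by zero or more
-- (operator, operand) pairs, folded into a left-associative tree.
pLeftAssoc :: Parser (Expr -> Expr -> Expr) -> Parser Expr -> Parser Expr
pLeftAssoc pOperator pOperand = do
  lhs <- pOperand
  rhs <- many (flip <$> pOperator <*> pOperand)
  pure (foldl (\expr f -> f expr) lhs rhs)

-- expr = term { ("+" | "-") term }
pExpr :: Parser Expr
pExpr = pLeftAssoc ((pSymbol "+" $> Add) <|> (pSymbol "-" $> Sub)) pTerm

-- term = factor { ("*" | "/") factor }
pTerm :: Parser Expr
pTerm = pLeftAssoc ((pSymbol "*" $> Mul) <|> (pSymbol "/" $> Div)) pFactor

-- factor = "(" expr ")" | num
pFactor :: Parser Expr
pFactor = between (pSymbol "(") (pSymbol ")") pExpr <|> pNum
</code></pre>
<p>With this in place, <code class="languageplaintext highlighterrouge">parseMaybe pExpr "8 * (10 - 6)"</code> should give <code class="languageplaintext highlighterrouge">Just (Mul (Num 8) (Sub (Num 10) (Num 6)))</code>, and adding another level of precedence would only take one more line of the same shape.</p>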
<p>The code for this post can be found <a href="https://github.com/d4ckard/blogcode/blob/3b8ea340e94f97c6892f92f64091f876c94b3993/20231031parsingexpressionsbyrecursivedescentinhaskell/Expr.hs">here</a>. It includes such a generic function for parsing expressions.</p>
<p>I hope this post managed to convey my enthusiasm about the elegance of parsing by recursive descent in Haskell! Feel free to reach out (<a href="mailto:">email</a>) if you find this interesting too, or if you have any questions/suggestions about the implementation presented here.</p>
<hr />
<div class="footnotes" role="docendnotes">
<ol>
<li id="fn:1" role="docendnote">
<p>I used <a href="https://q.uiver.app/">Quiver</a> to create the diagrams. It has an option to embed diagrams as iframes, but I decided not to use it, because I like how reliable and simple plain images are. <a href="#fnref:1" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
<li id="fn:2" role="docendnote">
<p>The curly braces denote zero or more repetitions of what’s inside them. A character in quotes refers to that literal character. The <code class="languageplaintext highlighterrouge">num</code> production rule/token is not included in the grammar. It refers to a numeric literal. <a href="#fnref:2" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
</ol>
</div>
The Root of the Dependency Tree20230917T00:00:00+00:00https://thasso.xyz/2023/09/17/therootofthedependencytree<p>The hobby debugger I am working on, <a href="https://github.com/d4ckard/spray">Spray</a>, features custom syntax highlighting of C source code. To implement this, I had to recursively parse all the type definitions in the current source file and in its dependencies.</p>
<p><a href="https://eli.thegreenplace.net/2011/05/02/thecontextsensitivityofcsgrammarrevisited">C is not a context-free language</a>, which leads to the so-called typedef-name problem <sup id="fnref:1" role="docnoteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. The problem is that <code class="languageplaintext highlighterrouge">typedef</code> can be used to make types look like regular identifiers. This creates some situations where context is needed to determine whether the given identifier is a type. Since types and identifiers should be highlighted with different colors, I had to get that context.</p>
<p>While slowly iterating on the logic required to solve this problem, I got to a point where I could inspect the entire public dependency tree of all the header files included in a single source file. Header files are all I need to worry about here, since that is where all public type definitions live.</p>
<p>This revealed some pretty interesting patterns, some of which I have already shared on <a href="https://twitter.com/d4kdd/status/1692816506898333712?s=20">Twitter/X</a>. What I found most interesting is that basically every program you will ever write<sup id="fnref:2" role="docnoteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> will somehow include the <code class="languageplaintext highlighterrouge">bits/wordsize.h</code> header file. Here is what it looks like on my machine:</p>
<div class="languagec highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* Determine the wordsize from the preprocessor defines. */</span>
<span class="cp">#if defined __x86_64__ && !defined __ILP32__
# define __WORDSIZE 64
#else
# define __WORDSIZE 32
#define __WORDSIZE32_SIZE_ULONG 0
#define __WORDSIZE32_PTRDIFF_LONG 0
#endif
</span>
<span class="cp">#ifdef __x86_64__
# define __WORDSIZE_TIME64_COMPAT32 1
</span><span class="cm">/* Both x86-64 and x32 use the 64-bit system call interface. */</span>
<span class="cp"># define __SYSCALL_WORDSIZE 64
#else
# define __WORDSIZE_TIME64_COMPAT32 0
#endif
</span></code></pre></div></div>
<p>As you can see, it simply defines the word size of the host processor. This information is then used all over the C standard library.</p>
<p>So, if you want your code to be used by the largest number of developers possible, contributing to this file is a great way to start!</p>
<hr />
<div class="footnotes" role="docendnotes">
<ol>
<li id="fn:1" role="docendnote">
<p>Actually, I haven’t gotten around to implementing <code class="languageplaintext highlighterrouge">typedef</code>s yet. For now it would be sufficient to scan for <code class="languageplaintext highlighterrouge">struct</code>, <code class="languageplaintext highlighterrouge">enum</code>, etc. to check if something is a type. <a href="#fnref:1" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
<li id="fn:2" role="docendnote">
<p>At least if you’re working on a Linux (and possibly other UNIX too) machine. But where else would one ever write software, right? <a href="#fnref:2" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
</ol>
</div>
An Intuition for Logarithms20230902T00:00:00+00:00https://thasso.xyz/2023/09/02/anintuitionforlogarithms<p>This post was originally inspired by <a href="https://twowrongs.com/learningsomelogarithms.html">learning some logarithms</a> by kqr. Just like his post, it’s about the excitement of discovery, and I hope to present some of the knowledge that I have gathered in an approachable way. Some parts are quite math-heavy, but they are not meant to be a formally rigorous study of the topics at hand. Rather, their purpose is to be insightful and practical.</p>
<p>I have become fascinated by the ability to calculate logarithmic functions in one’s head. To me, logarithms have always felt like a black box that couldn’t be conquered. They are a fundamental building block of mathematics. Yet every time I saw a logarithmic equation, I was quick to grab my calculator instead of solving the equation by hand. Over the last half year, I have spent some time improving my understanding of logarithms and learning how to compute the results of logarithmic equations by hand. Here is how I did it.</p>
<h1 id="whylearnthis">Why learn this?</h1>
<p>To me, the ability to compute logarithms purely by hand is greatly desirable. The number of concepts we can hold in <a href="https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two">working memory</a> at any point in time is limited, so it makes sense to internalize as many conceptual building blocks as possible. With a good intuition for logarithmic expressions, you’ll be far more comfortable dealing with equations involving logarithms and you’ll be able to deal with a level of complexity that would have been unthinkable to you before. Logarithms are also going to be less daunting or distracting to you when they appear in some other context.</p>
<p>On another note, I think there is a lot to say in favor of the aesthetic beauty of performing computationally complex operations without any tools except for systematic analytic thought.</p>
<p>Firstly, why is it that we don’t all find logarithms to be intuitive? To answer this question, I found it helpful to think of mathematical operations in terms of the <em>function</em> they compute and that function’s <em>inverse</em>. For example, the addition $y = x + a$ yields the value $y$ from $x$ and $a$ and the inverse of the addition results in one of the arguments: $y - a = x$. You can find a table of different examples of this for numerous functions and the restrictions that apply <a href="https://en.wikipedia.org/wiki/Inverse_function#Standard_inverse_functions">here on Wikipedia</a>. You will notice that at any level of complexity the inverse function is generally more difficult to compute in your head or by hand than the original. As a standalone function, the inverse is usually more abstract than its counterpart, and it becomes even harder to grasp as the complexity of the base operation increases. For example, a subtraction is often harder to compute manually than an addition and a multiplication is usually harder than both addition and subtraction.</p>
<p>For a long time, the limit of what I could realistically compute in my head – or rather <em>approximate</em> – was square roots. This was likely due to the fact that in German high school they make us memorize the square and thereby also the square root of all numbers between 0 and 20. Later, this became the starting point for learning about parabolas, polynomials and so on.</p>
<p>I can’t recall whether I felt bored during those introductory classes, but I am certain many people disliked endlessly performing simple calculations applying the same operation over and over again. The method is somewhat boring, but it seems to me that memorizing a number of common function values and then drilling their relationships is useful in building a base from which to construct a deeper understanding and improved intuition. It’s also essential to have some examples memorized if you want to approximate the values of complex functions in a timely manner without using a calculator.</p>
<p>Earlier on, the same approach was used for addition, subtraction, multiplication, and division. The general idea was introduced first, and then it was drilled for a while to get a feel for it. This is definitely not the ultimate learning strategy <sup id="fnref:1" role="docnoteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, but it does highlight the importance of getting to a place where you don’t have to think about the basic operations anymore. They are just there in your head, right at your fingertips.</p>
<p>For example, what’s the square root of $17$? Not knowing how to algorithmically compute the best approximation of $\sqrt{17}$, we can make a pretty good <em>intuitive</em> guess by relying on our knowledge of square roots. We memorized that $\sqrt{16} = 4$, that the square of the next integer after $4$ is $5^2 = 25$ and that the function $f(x) = x^2$ is continuous. Based on this knowledge, we can quickly say that $\sqrt{17}$ must be somewhere in the lower part of the interval $[4, 5]$.</p>
<p>This gets us pretty far, but let’s keep going. To improve our guess without spending any time on (generally) random guesses, we can approximate $f(x)$ for $4 \leq x \leq 5$ as a linear function. Based on this, we can conclude that since $25 - 16 = 9$ and $\sqrt{25} - \sqrt{16} = 5 - 4 = 1$, $\sqrt{17} = \sqrt{16 + 1} \approx 4 + \frac{1}{9}$. All of these calculations can be easily performed in one’s head or quickly be scribbled down on a piece of paper. The figure below illustrates what I have just explained. Again, note that this doesn’t involve any calculations that are difficult to do by hand.</p>
<p><img src="/public/figures/exampleapproximatesquareroot.png" alt="Figure 1: Approximating square roots" /></p>
<p>I hope you can see why this is so exciting! By simply memorizing a few sample values and by internalizing the relationships of the operations we need through drilling, it’s possible to make very fast guesses about otherwise complex computations. To prove my point, let’s find out how close we got. $4 + \frac{1}{9} \approx 4.111$ and $\sqrt{17} \approx 4.123$: our guess was off by only $0.012$. Imagine if you could approximate something like $\log_{10}(64)$ just as quickly!</p>
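<p>The same interpolation can be written down as a few lines of Haskell, in case you want to try other values. This is only a sketch of the idea above: <code class="languageplaintext highlighterrouge">approxSqrt</code> is a name I made up, and it assumes a non-negative input.</p>
<pre><code class="languagehaskell">-- Approximate a square root by linear interpolation between the two
-- neighbouring perfect squares, e.g. 16 and 25 for an input of 17.
-- Assumes x >= 0.
approxSqrt :: Double -> Double
approxSqrt x = lo + (x - lo * lo) / (hi * hi - lo * lo)
  where
    -- Largest whole number whose square does not exceed x.
    lo = last (takeWhile (\n -> n * n <= x) [0 ..])
    hi = lo + 1
</code></pre>
<p>For $x = 17$ this computes $4 + \frac{17 - 16}{25 - 16} = 4 + \frac{1}{9}$, the same guess as above. Comparing <code class="languageplaintext highlighterrouge">approxSqrt x</code> with <code class="languageplaintext highlighterrouge">sqrt x</code> for a few other inputs is a quick way to get a feel for how good the linear approximation is.</p>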
<h1 id="powersrootsandlogarithms">Powers, roots and logarithms</h1>
<p>There is <a href="https://math.stackexchange.com/questions/3693149/isntsquarerootabitlikelog">some confusion</a> about the relationship between powers, roots, and logarithms. Some common operations such as addition and multiplication are commutative, unlike exponentiation, which is not. Commutativity means that $a + b = b + a$ and $a \cdot b = b \cdot a$: we can freely exchange the order of the operands. This is not the case with powers. Generally, $a^b$ is not the same as $b^a$ <sup id="fnref:2" role="docnoteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. We can show that exponentiation cannot be commutative by choosing an example that doesn’t fulfill this property: $3^4 = 81 \neq 4^3 = 64$.</p>
<p>The fact that <a href="https://en.wikipedia.org/wiki/Commutative_property#Division,_subtraction,_and_exponentiation">exponentiation is not commutative</a> makes it possible to define two methods of undoing exponentiation: the nth root and the logarithm. This may seem surprising, since we’re so used to commutative operations like addition and multiplication that have only a single inverse operation.</p>
<p>Suppose we are given an exponentiation $x = b^n$ where $b$ and $x$ are positive real numbers, $b \neq 1$, and $n$ is a positive integer. If we know the values of $n$ and $x$, the nth root gives the value of $b = \sqrt[n]{x}$. Otherwise, if we know the values of $b$ and $x$, the logarithm of $x$ to the base $b$ gives the value of the exponent $n = \log_b(x)$.</p>
<p>Both the nth root and the logarithm take the result of an exponentiation as input. Simply put, what makes them different is the missing piece they fill in. Logarithms can be used to calculate the exponent from the result value and the base. The nth root, on the other hand, takes the result value and the original exponent to compute the base.</p>
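<p>As a small illustration with numbers of my own choosing, take $b = 3$ and $n = 4$. The same exponentiation can be read in three ways, depending on which of the three values is missing:</p>
\[3^4 = 81 \qquad \sqrt[4]{81} = 3 \qquad \log_3(81) = 4\text{.}\]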
<p>The following figure helped me personally to better understand the relationship between the three functions.</p>
<p><img src="/public/figures/relationshippowerrootlogarithm.png" alt="Figure 2: Relationship between powers, roots and logarithms" /></p>
<p>It comes from a German mathematics textbook called Handbuch Mathematik<sup id="fnref:3" role="docnoteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> which translates to <i>Handbook Mathematics</i>. The caption reads: “<i>Relationship between power, root and logarithm or between value of power, base and exponent</i>”. This illustration shows the <em>triangular relationship</em> between the three functions, which is key. The same idea is used in <a href="https://math.stackexchange.com/a/165225">this answer on Stack Exchange</a> to suggest an alternative notation for powers, roots, and logarithms that emphasizes this relationship.</p>
<h1 id="definitionoflogarithms">Definition of logarithms</h1>
<p>Say we have an exponentiation $x = b^y$ where $y$, $b$ and $x$ are real numbers. From now on $b$ and $x$ are positive (excluding 0) and $b \neq 1$. For such an exponentiation, the logarithm of $x$ to the base $b$ <a href="https://www.alamo.edu/contentassets/3c031ab72f3d4dbda979bc9e66d11634/exponential/math1414logarithmeticfunctions.pdf" title="Definition of the logarithmic function as given by Crystal Hull">is defined as</a></p>
\[\log_b(x) = y ~~ \Leftrightarrow ~~ b^y = x\text{.}\]
<p>In the exponentiation $x = b^y$, if we know $x$ and $b$, the logarithm of $x$ to the base $b$ will give us the missing exponent. From this definition we get the following identities <sup id="fnref:4" role="docnoteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>:</p>
\[\text{(1)} ~~~ x = b^{\log_b(x)} ~~~~~~ \text{(2)} ~~~ x = \log_b({b^x})\text{.}\]
<p>Both of these are crucial. Ideally, you’d want them to come to mind every time you’re looking to solve a logarithmic equation. While they represent fundamental relationships of the logarithmic and exponential form, their <em>reverseness</em> makes them difficult to think about.</p>
<p>Let’s start by digesting $x = b^{\log_b(x)}$. To break up the equation, we’ll call the logarithm in the exponent $a = \log_b(x)$. This means that $a$ is the exponent to the base $b$ such that $b^a = x$. With this step of indirection, it’s easy to see why the equation must be true. Think of it like this: “Raise $b$ to the power $a$ that satisfies the property that $b^a = x$.”</p>
<p>The second equation $x = \log_b(b^x)$ is easier to understand. I like to ask myself the question: “What’s the exponent $x$ to the base $b$, such that $b^x = b^x$? It’s $x$!”.</p>
<p>Based on this knowledge alone, we are also able to deduce the values of a few logarithmic expressions. For example, what is the value of $\log_a(a)$? Since $a = a^1$, we can rewrite this as $\log_a(a^1)$ or equally $\log_a(a^x)$ with $x = 1$. Now, it’s obvious from the second identity that the answer is 1. The same approach of relying on the laws of exponents works for $\log_a(1)$. If we rewrite this in the same way by substituting $a^0$ for $1$, we get $\log_a(a^0) = 0$.</p>
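<p>It can help to see both identities with concrete numbers plugged in. The values $b = 2$ and $x = 8$ for the first identity, and $b = 2$ and $x = 3$ for the second, are just an example I picked:</p>
\[\text{(1)} ~~~ 2^{\log_2(8)} = 2^3 = 8 ~~~~~~ \text{(2)} ~~~ \log_2(2^3) = \log_2(8) = 3\text{.}\]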
<h2 id="multiplicationasaddition">Multiplication as addition</h2>
<p>The following is another central identity that opens up a lot of possibilities for us when it comes to computing logarithms manually.</p>
\[\log_b(x \cdot y) = \log_b(x) + \log_b(y) ~~~~ \text{if} ~ y > 0\]
<p>The proof can be constructed from the two identities above and the rules of exponents. We start by using the first identity $x = b^{\log_b(x)}$ to substitute $x$ and $y$ in the left side of the equation:</p>
\[\log_b(x \cdot y) = \log_b(b^{\log_b(x)} \cdot b^{\log_b(y)})\text{.}\]
<p>Using the law of exponents which states that $x^a \cdot x^b = x^{a + b}$, we can simplify this into</p>
\[\log_b(b^{\log_b(x)} \cdot b^{\log_b(y)}) = \log_b(b^{\log_b(x) + \log_b(y)}) \text{.}\]
<p>Finally, the second identity $x = \log_b({b^x})$ is used to simplify this even further:</p>
\[\log_b(b^{\log_b(x) + \log_b(y)}) = \log_b(x) + \log_b(y) \text{.}\]
<p>John Napier is said to have introduced this equation in 1614, and it quickly became famous because it allowed <a href="https://en.wikipedia.org/wiki/Mathematical_table#Tables_of_logarithms">reducing complex multiplications to simple additions</a>. Instead of performing the multiplication itself, people could simply look up the values of $\log_b(x)$ and $\log_b(y)$ and then add them together. This is one of the techniques we’ll use later to solve logarithmic equations by hand. Instead of looking up the values in a logarithm table, we’ll memorize a few key ones.</p>
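<p>Here is a rough sketch of that table trick in C++. The function name <code>multiply_via_logs</code> is made up for this example, and the calls to <code>std::log10</code> and <code>std::pow</code> stand in for looking values up in a printed table and for reading the table in reverse.</p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: multiply two positive numbers by adding their base 10
// logarithms and then undoing the logarithm, as one would with a log table.
double multiply_via_logs(double x, double y) {
  assert(x > 0.0 && y > 0.0);
  const double log_sum = std::log10(x) + std::log10(y);  // "look up" and add
  return std::pow(10.0, log_sum);                        // "reverse lookup"
}
```

<p>Up to floating point rounding, <code>multiply_via_logs(123.0, 456.0)</code> agrees with the ordinary product $123 \cdot 456 = 56088$.</p>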
<h2 id="divisionassubtraction">Division as subtraction</h2>
<p>The laws of exponents also state that $x^a / x^b = x^{a - b}$. Hence, the above equation also works for division:</p>
\[\log_b(x / y) = \log_b(x) - \log_b(y) ~~~~ \text{if} ~ y > 0 \text{.}\]
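<p>Spelled out, the argument follows the same three steps as the proof for multiplication: substitute with the first identity, apply the law of exponents, and simplify with the second identity.</p>
\[\log_b(x / y) = \log_b\left(\frac{b^{\log_b(x)}}{b^{\log_b(y)}}\right) = \log_b(b^{\log_b(x) - \log_b(y)}) = \log_b(x) - \log_b(y)\text{.}\]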
<h2 id="exponentiationasmultiplication">Exponentiation as multiplication</h2>
<p>There is another law of exponents which states that $(x^a)^b = x^{a \cdot b}$. Based on this, we get the following mind-boggling equation:</p>
\[\log_b(x^c) = c \cdot \log_b(x)\]
<p>This can be proved by first defining an auxiliary variable $y = \log_b(x)$, so that $b^y = x$ (first identity). By substituting $b^y$ for $x$, we get \(\log_b((b^y)^c)\). Next, we apply this law of exponents:</p>
\[\log_b((b^y)^c) = \log_b(b^{y \cdot c})\text{.}\]
<p>Lastly, we again apply the second identity from the definition and then substitute the original definition of $y$:</p>
\[\log_b(b^{y \cdot c}) = y \cdot c = c \cdot \log_b(x)\text{.}\]
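<p>Two quick examples of my own show how handy this is. The second one uses the fact that a square root is the same as raising to the power $\frac{1}{2}$:</p>
\[\log_{10}(1000) = \log_{10}(10^3) = 3 \cdot \log_{10}(10) = 3 ~~~~~~ \log_{10}(\sqrt{10}) = \log_{10}(10^{1/2}) = \frac{1}{2} \cdot \log_{10}(10) = \frac{1}{2}\text{.}\]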
<h2 id="changeofbasis">Change of base</h2>
<p>This is the last concept we need to get started calculating logarithms by hand. The ability to change the base of a logarithmic expression is very valuable to us, because it means that we only have to remember a relatively small set of values for a single base. We’re then able to convert between bases and thereby solve expressions in a variety of bases.</p>
\[\log_b(x) = \frac{\log_a(x)}{\log_a(b)} ~~~~ \text{if} ~ a > 0, ~ a \neq 1\]
<p>Again, we’ll start by defining a variable $y = \log_b(x)$. It’s important to recall that this is only used to simplify subsequent equations and make them easy to grasp. We can transform the left side of the equation into the exponential form.</p>
\[\log_b(x) = y ~~ \Leftrightarrow ~~ b^y = x\]
<p>From here, we continue by taking the logarithm of both sides to the desired base $a$. Like $b$, $a$ can be any positive real number except $1$.</p>
\[\begin{align*}
b^y &= x & & | ~ \log_a\\
\log_a(b^y) &= \log_a(x)
\end{align*}\]
<p>We can further transform this equation using the <em>‘exponentiation as multiplication’</em> identity.</p>
\[\begin{align*}
\log_a(b^y) &= \log_a(x)\\[2pt]
y \cdot \log_a(b) &= \log_a(x)
\end{align*}\]
<p>The final step is to isolate the variable $y$ and to substitute the original $\log_b(x)$ for it.</p>
\[\begin{align*}
y \cdot \log_a(b) &= \log_a(x) & & | ~ \div \log_a(b)\\[2pt]
y &= \frac{\log_a(x)}{\log_a(b)}\\[2pt]
\log_b(x) &= \frac{\log_a(x)}{\log_a(b)}
\end{align*}\]
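<p>As a quick sanity check with numbers I chose myself, we can compute a binary logarithm from two base $10$ values, here rounded to three decimal places:</p>
\[\log_2(8) = \frac{\log_{10}(8)}{\log_{10}(2)} \approx \frac{0.903}{0.301} = 3\text{.}\]
<p>This matches what we expect, since $2^3 = 8$.</p>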
<p>It might be a good idea to make flash cards for yourself to memorize the different identities or <em>rules</em> we can use to transform logarithmic equations. Don’t put too much on the same card. Instead, try to spread out the different insights over several cards.</p>
<p>Theory-wise, that’s all I am going to cover in this post. Obviously, there is a lot more to know about logarithms, which I have left out here for the sake of brevity. That said, all of this gives us a good foundation to expand on in the future. More importantly, it is enough for us to calculate the values of many different logarithmic expressions.</p>
<h1 id="memorizealogarithmtable">Memorize a logarithm table</h1>
<p>As mentioned above, logarithms were a huge breakthrough when they were first discovered, because they made complicated calculations relatively simple. In a time before machine computers, this was very valuable. The values of many different logarithmic expressions were calculated once and then collected into so-called <a href="https://en.wikipedia.org/wiki/Mathematical_table#Tables_of_logarithms">logarithm tables</a>. After transforming a given problem, you could look up the closest value in the table, and you’d have a pretty good estimate.</p>
<p><img src="https://thasso.xyz/public/figures/logarithmtablehenrybriggs.jpg" alt="Figure 3: Early table of logarithms by Henry Briggs from 1617" title="Composite image of two pages of Henry Briggs' Logarithmorum Chilias Prima. Source of the images: http://www.pmonta.com/tables/logarithmorumchiliasprima/index.html" /></p>
<p>Of course, memorizing that many numbers would be a bit extreme. It’s not impossible, and it will certainly improve the accuracy of your approximations, but it’s not worth the time for most people. Therefore, we will reduce the set of values to memorize to the bare minimum.</p>
<p>We can change the base of any logarithm to a base we know if we know the value of the logarithm of the original base to the desired base. This means that we only need to memorize a range of logarithms to a single base, as well as some logarithms of other bases to the same base we’ve chosen. Here, I opted for base $10$ logarithms and conversions from the natural and binary logarithms, since they are the most <em>common</em>. The range that we need to memorize is also quite small, since multiplications can be broken up into additions. That is, it’s sufficient to know the values of $\log_{10}(8)$ and $\log_{10}(10)$ to calculate $\log_{10}(80)$ because</p>
\[\log_{10}(80) = \log_{10}(10 \cdot 8) = \log_{10}(10) + \log_{10}(8) \text{.}\]
<p>Depending on how hard you want to make it for yourself, you can also choose different degrees of precision. As suggested by <a href="https://twowrongs.com/learningsomelogarithms.html">kqr</a>, it’s possible to get quite satisfactory results with only a single digit of precision. For completeness, I also added a higher precision option for each value. It makes sense to memorize the higher precision values for the logarithms that you’ll use most often. Therefore, the default precision for $\log_{10}(e)$ and $\log_{10}(2)$ is slightly higher.</p>
<style>
.good {
background-color: #dcedc8 !important;
}
.gradientrow td {
border: 0px solid #fff;
}
.gradientrow td {
--first: #FFFDE7;
--second: #E8F5E9;
--third: #E1F5FE;
--fourth: #EDE7F6;
}
.none {
background-color: #fff !important;
}
.noneintofirst {
background: linear-gradient(to right, #fff 80%, var(--first));
}
.firstintosecond {
background: linear-gradient(to right, var(--first) 80%, var(--second));
}
.second {
background-color: var(--second) !important;
}
.secondintothird {
background: linear-gradient(to right, var(--second) 80%, var(--third));
}
.third {
background-color: var(--third) !important;
}
.thirdintofourth {
background: linear-gradient(to right, var(--third) 80%, var(--fourth));
}
.fourth {
background-color: var(--fourth) !important;
}
</style>
<!-- Markdown table source (good for making edits):
| logarithm | base precision | base error | extra precision | extra precision error |
|---|---|---|---|---|
| $\log_{10}(1)$ | 0 | / | / | / |
| $\log_{10}(10)$ | 1 | / | / | / |
| $\log_{10}(e)$ | 0.43 | 1.00 % | 0.4343 | 0.00 % |
| $\log_{10}(2)$ | 0.30 | 0.34 % | 0.3010 | 0.01 % |
| $\log_{10}(3)$ | 0.5 | 4.80 % | 0.477 | 0.03 % |
| $\log_{10}(4)$ | 0.6 | 0.34 % | 0.602 | 0.01 % |
| $\log_{10}(5)$ | 0.7 | 0.15 % | 0.699 | 0.00 % |
| $\log_{10}(6)$ | 0.8 | 2.81 % | 0.778 | 0.02 % |
| $\log_{10}(7)$ | 0.8 | 5.34 % | 0.845 | 0.01 % |
| $\log_{10}(8)$ | 0.9 | 0.34 % | 0.903 | 0.01 % |
| $\log_{10}(9)$ | 1.0 | 4.80 % | 0.954 | 0.03 % |
-->
<table class="prettytable">
<tbody>
<tr>
<td>logarithm</td>
<td>base precision</td>
<td>base error</td>
<td>extra precision</td>
<td>extra precision error</td>
</tr>
<tr>
<td>$\log_{10}(1)$</td>
<td class="good">0</td>
<td>/</td>
<td>/</td>
<td>/</td>
</tr>
<tr>
<td>$\log_{10}(10)$</td>
<td class="good">1</td>
<td>/</td>
<td>/</td>
<td>/</td>
</tr>
<tr>
<td>$\log_{10}(e)$</td>
<td>0.43</td>
<td>1.00 %</td>
<td class="good">0.4343</td>
<td>0.00 %</td>
</tr>
<tr>
<td>$\log_{10}(2)$</td>
<td class="good">0.30</td>
<td>0.34 %</td>
<td>0.3010</td>
<td>0.01 %</td>
</tr>
<tr>
<td>$\log_{10}(3)$</td>
<td>0.5</td>
<td>4.80 %</td>
<td class="good">0.477</td>
<td>0.03 %</td>
</tr>
<tr>
<td>$\log_{10}(4)$</td>
<td class="good">0.6</td>
<td>0.34 %</td>
<td>0.602</td>
<td>0.01 %</td>
</tr>
<tr>
<td>$\log_{10}(5)$</td>
<td class="good">0.7</td>
<td>0.15 %</td>
<td>0.699</td>
<td>0.00 %</td>
</tr>
<tr>
<td>$\log_{10}(6)$</td>
<td>0.8</td>
<td>2.81 %</td>
<td class="good">0.778</td>
<td>0.02 %</td>
</tr>
<tr>
<td>$\log_{10}(7)$</td>
<td>0.8</td>
<td>5.34 %</td>
<td class="good">0.845</td>
<td>0.01 %</td>
</tr>
<tr>
<td>$\log_{10}(8)$</td>
<td class="good">0.9</td>
<td>0.34 %</td>
<td>0.903</td>
<td>0.01 %</td>
</tr>
<tr>
<td>$\log_{10}(9)$</td>
<td>1.0</td>
<td>4.80 %</td>
<td class="good">0.954</td>
<td>0.03 %</td>
</tr>
</tbody>
</table>
<p>As you can see, for each logarithmic expression there is a value with an error of less than 1 % (the one in green). Personally, I have chosen to memorize the following sequence of values. I find them relatively easy to remember because of their regularities. The downside of this is that some of the values have an error greater than 1 %. They also highlight a key characteristic of logarithmic growth, namely that the input value has to increase by some factor for the output value to increase by a constant amount<sup id="fnref:5" role="docnoteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>.</p>
<table class="prettytable">
<tbody>
<tr>
<td colspan="2">$x$</td>
<td colspan="2">$e$</td>
<td colspan="2">1</td>
<td colspan="2">2</td>
<td colspan="2">3</td>
<td colspan="2">4</td>
<td colspan="2">5</td>
<td colspan="2">6</td>
<td colspan="2">7</td>
<td colspan="2">8</td>
<td colspan="2">9</td>
<td colspan="2">10</td>
</tr>
<tr>
<td colspan="2">$\log_{10}(x)$</td>
<td colspan="2">0.4343</td>
<td colspan="2">0.0</td>
<td colspan="2">0.3</td>
<td colspan="2">0.5</td>
<td colspan="2">0.6</td>
<td colspan="2">0.7</td>
<td colspan="2">0.8</td>
<td colspan="2">0.85</td>
<td colspan="2">0.9</td>
<td colspan="2">0.95</td>
<td colspan="2">1.0</td>
</tr>
<tr class="gradientrow">
<td colspan="2" class="none"></td>
<td colspan="3" class="noneintofirst"></td>
<td colspan="2" class="firstintosecond">$\underset{+0.3}{⤻}$</td>
<td colspan="2" class="secondintothird">$\underset{+0.2}{⤻}$</td>
<td colspan="2" class="third">$\underset{+0.1}{⤻}$</td>
<td colspan="2" class="third">$\underset{+0.1}{⤻}$</td>
<td colspan="2" class="thirdintofourth">$\underset{+0.1}{⤻}$</td>
<td colspan="2" class="fourth">$\underset{+0.05}{⤻}$</td>
<td colspan="2" class="fourth">$\underset{+0.05}{⤻}$</td>
<td colspan="2" class="fourth">$\underset{+0.05}{⤻}$</td>
<td colspan="2" class="fourth">$\underset{+0.05}{⤻}$</td>
</tr>
</tbody>
</table>
<p>Again, I’d recommend creating some flash cards for yourself to memorize these values. If you don’t care about precision that much, you can make it easier for yourself by using the more regular values in the lower table. Otherwise, use the upper table to pick and choose.</p>
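<p>If you like to think in code, the lower table together with the multiplication rule can be sketched as a tiny lookup. The names <code>kLog10Table</code> and <code>estimate_log10</code> are invented for this example, and the sketch only handles numbers of the form $d \cdot 10^k$ with a single leading digit $d$.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Memorized values from the lower table: index d holds log10(d) for
// d = 1..10. Index 0 is a placeholder and is never read.
constexpr std::array<double, 11> kLog10Table = {
    0.0, 0.0, 0.3, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95, 1.0};

// Hypothetical helper: estimate log10(d * 10^k) as log10(d) + k, using
// log10(10^k) = k and the "multiplication as addition" rule.
double estimate_log10(int d, int k) {
  assert(d >= 1 && d <= 10);
  return kLog10Table[static_cast<std::size_t>(d)] + k;
}
```

<p>For instance, <code>estimate_log10(8, 1)</code> gives about $1.9$ as an estimate of $\log_{10}(80)$, which is close to the more precise value $1.903$ from the upper table.</p>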
<h1 id="thehardpart">The hard part</h1>
<p>Now we’ve got most things covered. The last piece that’s missing is getting it all into your brain forever. The best way to start is to look up a bunch of exercises and work through them. For example, I found <a href="https://math.colorado.edu/math1300/resources/Exercises_LogarithmicFunction.pdf">this one</a> early on.</p>
<p>I said at the beginning of this post that you’d learn how to quickly calculate $\log_{10}(64)$ by hand. Now that we have a good idea of what we’re dealing with, this shouldn’t be too difficult.</p>
<p>Let’s think: we need to break up $64$ into an expression made up of logarithms we know. This is not difficult, for example $64 = 8 \cdot 8$. The rest is easy:</p>
\[\log_{10}(64) = \log_{10}(8 \cdot 8) = \log_{10}(8) + \log_{10}(8) \approx 0.9 + 0.9 = 1.8\text{.}\]
<p>Just writing this, I am filled with a rush of excitement and awe. Let’s see how close we got this time. $\log_{10}(64) \approx 1.8062$ which means we were off by only $0.0062$! We won’t get this close for all numbers and many expressions are more difficult than this one. But to me, it’s really amazing how easy this is.</p>
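<p>There is usually more than one way to break a number up. As an alternative route, we could also write $64 = 2^6$ and use the <em>‘exponentiation as multiplication’</em> identity, which leads to the same estimate:</p>
\[\log_{10}(64) = \log_{10}(2^6) = 6 \cdot \log_{10}(2) \approx 6 \cdot 0.3 = 1.8\text{.}\]
<p>With the higher precision value $\log_{10}(2) \approx 0.3010$ from the table, the same calculation gives $6 \cdot 0.3010 = 1.806$.</p>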
<p>Above, I talked about the historical use of logarithms to speed up multiplications of large factors. Now we can use this technique as well. <a href="https://math.stackexchange.com/a/2296546">This answer</a> on Stack Exchange is a good example of how to do so. Using logarithms in this way has become somewhat obsolete now that we have so many computers. But if you are interested in how to do this, you should definitely read the answer.</p>
<p>Finally, I want to show how we can use our newfound knowledge to solve for any exponent. For example, let’s look at the following equation: $0.5^x = 0.1$. Here, we have an unknown in the exponent. By definition, we need to use logarithms here.</p>
\[\begin{align*}
0.5^x &= 0.1 & | ~~ \log_{0.5}\\[2pt]
x &= \log_{0.5}(0.1)
\end{align*}\]
<p>This time, we need to change the logarithm’s base to 10. After doing so, we are left with values which we have memorized.</p>
\[\begin{align*}
x &= \log_{0.5}(0.1)\\[2pt]
x &= \frac{\log_{10}(0.1)}{\log_{10}(0.5)}\\[2pt]
x &= \frac{\log_{10}(1 / 10)}{\log_{10}(1 / 2)}\\[2pt]
x &= \frac{\log_{10}(1) - \log_{10}(10)}{\log_{10}(1) - \log_{10}(2)}\\[2pt]
x &\approx \frac{0 - 1}{0 - 0.3}\\[2pt]
x &\approx \frac{1}{0.3}\\[2pt]
x &\approx 3.33
\end{align*}\]
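<p>To see how good this estimate is, here is a small sketch that does the same calculation with library functions. Both function names are hypothetical; <code>solve_exponent</code> applies the change of base rule with <code>std::log10</code>, and <code>mental_estimate</code> repeats the mental arithmetic with the memorized one-digit values.</p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: solve b^x = y for x via the change of base rule,
// x = log10(y) / log10(b).
double solve_exponent(double b, double y) {
  assert(b > 0.0 && b != 1.0 && y > 0.0);
  return std::log10(y) / std::log10(b);
}

// The mental version for 0.5^x = 0.1, using only memorized values:
// log10(1) = 0, log10(10) = 1 and log10(2) ~ 0.3.
double mental_estimate() {
  return (0.0 - 1.0) / (0.0 - 0.3);
}
```

<p>Here <code>solve_exponent(0.5, 0.1)</code> gives roughly $3.32$, so the mental estimate of $3.33$ is off by only about $0.01$.</p>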
<h1 id="conclusion">Conclusion</h1>
<p>With this post, I hope to share some of the joyful moments of insight I felt as I delved deeper into understanding logarithms. Here, I have focused mainly on how to deal with logarithmic equations and how to solve them manually. There is much more to know about logarithms, about their history, their modern use, and especially their connection to the exponential function. Still, I have tried to provide comprehensive explanations, and I hope you found them insightful.</p>
<p><a href="https://thasso.xyz/about">Feel free to contact me</a> if you have any questions or suggestions. I’m always curious about your feedback. Also, if you find any mistakes, I would be grateful if you could point them out so I can correct them.</p>
<hr />
<div class="footnotes" role="docendnotes">
<ol>
<li id="fn:1" role="docendnote">
<p>For one, many students dislike this sort of learning so much that they completely lose interest in math. <a href="#fnref:1" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
<li id="fn:2" role="docendnote">
<p>In fact, there is only <a href="https://keithmcnulty.medium.com/onlyonepairofdistinctintegerssatisfythisequation76ea45469a96">one pair of distinct positive integers $n$ and $m$</a> that fulfills the requirement that $n^m = m^n$, namely $2^4 = 4^2 = 16$. <a href="#fnref:2" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
<li id="fn:3" role="docendnote">
<p>Scholl, W., & Drews, R. (1997). <cite>Handbuch Mathematik</cite>. Falken. <a href="#fnref:3" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
<li id="fn:4" role="docendnote">
<p>Knuth, D. E. (1997). <cite>The art of computer programming: Fundamental algorithms</cite> (3rd ed., Vol. 1). Addison Wesley Longman Publishing Co., Inc. <a href="#fnref:4" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
<li id="fn:5" role="docendnote">
<p>Abelson, H., Sussman, G. J., & Sussman, J. (1996). <cite>Structure and Interpretation of Computer Programs</cite> (2nd ed.). MIT Press. 1.2.3 Orders of Growth <a href="#fnref:5" class="reversefootnote" role="docbacklink">↩</a></p>
</li>
</ol>
</div>
Can you use a class in C?20230811T00:00:00+00:00https://thasso.xyz/2023/08/11/canyouuseaclassinc<p>Recently, I’ve been <a href="https://github.com/d4ckard/spray">working on a C debugger</a>. This requires reading and processing the DWARF debugging information that’s part of the binary. Since this is a rather complex task, I figured I might use a library that exports a nice interface to the debugging information.</p>
<p>One such library that I found early on was <a href="https://github.com/aclements/libelfin">libelfin</a>. It wasn’t perfect from the start because it is a bit dated now, only supporting DWARF 4 and missing features from the newer DWARF 5 standard, but I thought that I could work around this. The bigger problem was that libelfin is written in C++ while most of the debugger is written in C.</p>
<p>It is pretty easy to call code written in C from C++ since most of C is still part of the subset that C++ supports. The problem with calling C++ code from C is that there are many features in C++ that C is missing. This means that the C++ interface must be simplified for C to be able to understand it.</p>
<h1 id="handlingobjects">Handling objects</h1>
<p>The most important concept in C++ that C is missing is true object orientation. That is, in C <a href="https://eev.ee/blog/2013/03/03/thecontrollerpatternisawfulandotherooheresy/">you don’t get a this pointer for free</a>; you need to handle it manually.</p>
<p>Let’s start with a simple example. Say we have a class that represents a <a href="https://en.wikipedia.org/wiki/Rational_number">rational number</a> $r = p / q$ where $q \neq 0$. The declaration, without any of the operations we need, might look something like this. The usage example further down prints <code class="languageplaintext highlighterrouge">5 / 3</code> when we run it.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.h</span>
<span class="k">class</span> <span class="nc">Rational</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="kt">int</span> <span class="n">_numer</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">_denom</span><span class="p">;</span>
<span class="n">Rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">)</span>
<span class="o">:</span> <span class="n">_numer</span><span class="p">{</span><span class="n">numer</span><span class="p">},</span> <span class="n">_denom</span><span class="p">{</span><span class="n">denom</span><span class="p">}</span> <span class="p">{}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>This is how we might use it in C++:</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// main.cc</span>
<span class="cp">#include <iostream>
#include "rational.h"
</span>
<span class="k">auto</span> <span class="nf">main</span><span class="p">()</span> <span class="o">-></span> <span class="kt">int</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">r</span> <span class="o">=</span> <span class="n">Rational</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">);</span>
<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o"><<</span> <span class="n">r</span><span class="p">.</span><span class="n">_numer</span> <span class="o"><<</span> <span class="s">" / "</span> <span class="o"><<</span> <span class="n">r</span><span class="p">.</span><span class="n">_denom</span> <span class="o"><<</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="err">}</span>
</code></pre></div></div>
<p>How do you write this as a C program using the <code class="languageplaintext highlighterrouge">Rational</code> class? After all, there is no such thing as a class in C. To solve this issue we can rely on one of the primitives that most systems languages have in common by virtue of running on the same type of computer: the pointer. We will allocate an instance of our class on the heap and then give the C program a pointer to that instance. This way we can keep track of the object to manipulate it. It’s also possible to use handles for this, but they are just <a href="https://floooh.github.io/2018/06/17/handlesvspointers.html">pointers with extra steps</a> and a bit overkill for us at this point.</p>
<p>The following is what we might want.</p>
<div class="languagec highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// main.c</span>
<span class="cp">#include <stdio.h>
#include "rational.h"
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">r</span> <span class="o">=</span> <span class="n">make_rational</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">);</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"%d / %d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">get_numer</span><span class="p">(</span><span class="n">r</span><span class="p">),</span> <span class="n">get_denom</span><span class="p">(</span><span class="n">r</span><span class="p">));</span>
<span class="n">del_rational</span><span class="p">(</span><span class="o">&</span><span class="n">r</span><span class="p">);</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>We need to extend our interface with all the new functions to construct, access and manually delete instances of <code class="languageplaintext highlighterrouge">Rational</code>.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.h</span>
<span class="k">class</span> <span class="nc">Rational</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span>
<span class="kt">void</span> <span class="o">*</span><span class="nf">make_rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">);</span>
<span class="kt">int</span> <span class="nf">get_numer</span><span class="p">(</span><span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">r</span><span class="p">);</span>
<span class="kt">int</span> <span class="nf">get_denom</span><span class="p">(</span><span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">r</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">del_rational</span><span class="p">(</span><span class="kt">void</span> <span class="o">**</span><span class="n">rp</span><span class="p">);</span>
</code></pre></div></div>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.cc</span>
<span class="cp">#include "rational.h"
#include <cstdlib>
</span>
<span class="kt">void</span> <span class="o">*</span><span class="nf">make_rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Allocate an instance on the heap.</span>
<span class="n">Rational</span> <span class="o">*</span><span class="n">r</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="n">Rational</span><span class="o">*></span><span class="p">(</span><span class="n">malloc</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">Rational</span><span class="p">)));</span>
<span class="n">r</span><span class="o">-></span><span class="n">_numer</span> <span class="o">=</span> <span class="n">numer</span><span class="p">;</span>
<span class="n">r</span><span class="o">-></span><span class="n">_denom</span> <span class="o">=</span> <span class="n">denom</span><span class="p">;</span>
<span class="k">return</span> <span class="n">r</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">get_numer</span><span class="p">(</span><span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">r</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Cast to access members.</span>
<span class="k">const</span> <span class="n">Rational</span> <span class="o">*</span><span class="n">_r</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="k">const</span> <span class="n">Rational</span><span class="o">*></span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="k">return</span> <span class="n">_r</span><span class="o">-></span><span class="n">_numer</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">get_denom</span><span class="p">(</span><span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">r</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="n">Rational</span> <span class="o">*</span><span class="n">_r</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="k">const</span> <span class="n">Rational</span><span class="o">*></span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="k">return</span> <span class="n">_r</span><span class="o">-></span><span class="n">_denom</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">del_rational</span><span class="p">(</span><span class="kt">void</span> <span class="o">**</span><span class="n">rp</span><span class="p">)</span> <span class="p">{</span>
<span class="n">Rational</span> <span class="o">*</span><span class="n">_r</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="n">Rational</span><span class="o">*></span><span class="p">(</span><span class="o">*</span><span class="n">rp</span><span class="p">);</span>
<span class="c1">// Delete the instance on the heap.</span>
<span class="n">free</span><span class="p">(</span><span class="n">_r</span><span class="p">);</span>
<span class="c1">// Delete the dangling pointer too.</span>
<span class="o">*</span><span class="n">rp</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The trick is to <strong>allocate instances on the heap and then pass them around as <code class="languageplaintext highlighterrouge">void</code> pointers</strong>. We use C’s <code class="languageplaintext highlighterrouge">malloc</code> instead of the <code class="languageplaintext highlighterrouge">new</code> operator because <code class="languageplaintext highlighterrouge">new</code> relies on the C++ runtime library, which leads to a linker error when the program is linked by the C compiler. A good way to improve type safety is to <code class="languageplaintext highlighterrouge">typedef</code> an opaque type to represent the class on the C side, as suggested in <a href="https://github.com/d4ckard/blogcode/issues/1#issue1848643298">this reply</a>. This is the approach that we’ll be using later on, so keep on reading. Alternatively, if you have control over all of the C++ code (i.e. you don’t just wrap a library), you could follow <a href="https://stackoverflow.com/a/7281477">this Stack Overflow answer</a> too.</p>
<p>Now, ignoring how incredibly unsafe all of this is, there is a bigger problem we must face: this is not even close to compiling!
The reason for this is that when we <code class="languageplaintext highlighterrouge">#include "rational.h"</code> into <code class="languageplaintext highlighterrouge">main.c</code>, we essentially copy all the contents of <code class="languageplaintext highlighterrouge">rational.h</code> into the C source file. This means that we suddenly present the C compiler with a class declaration and other things that it doesn’t understand because they are part of a totally different language.</p>
<p>We can use the C preprocessor to help us here. Using the <code class="languageplaintext highlighterrouge">__cplusplus</code> macro, which only C++ compilers define, we can check whether to include the C++ parts in the interface. This way they are hidden from the C compiler but available to the C++ compiler.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.h</span>
<span class="cp">#ifdef __cplusplus
</span><span class="k">class</span> <span class="nc">Rational</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="kt">int</span> <span class="n">_numer</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">_denom</span><span class="p">;</span>
<span class="n">Rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">)</span>
<span class="o">:</span> <span class="n">_numer</span><span class="p">{</span><span class="n">numer</span><span class="p">},</span> <span class="n">_denom</span><span class="p">{</span><span class="n">denom</span><span class="p">}</span> <span class="p">{}</span>
<span class="p">};</span>
<span class="cp">#endif // __cplusplus
</span>
<span class="c1">// ...</span>
</code></pre></div></div>
<p>Building the program with the two different compilers could look like this: <code class="languageplaintext highlighterrouge">g++ -c rational.cc && gcc main.c rational.o</code>.</p>
<p>Great, it compiles! But uhh … now the linker signals an error. There are two problems left to fix. Firstly, functions in C++ have C++ language linkage by default, so they are not guaranteed to follow the same <a href="https://en.wikipedia.org/wiki/Application_binary_interface">ABI</a> and calling convention as C functions. Secondly, C++ compilers mangle the names of identifiers in the source code (for example to support overloading), whereas C compilers emit them essentially unchanged, so the linker can’t find the symbols that the C code refers to. Fortunately, C is the <em>lingua franca</em> of computer programming, so C++ compilers can adapt their behavior in both of these aspects to that of C compilers. To do so, we just <strong>prefix all C++ declarations that should be used by C code with <code class="languageplaintext highlighterrouge">extern "C"</code></strong>.</p>
<p>This is very simple to do in the <code class="languageplaintext highlighterrouge">rational.cc</code> source file, but requires some extra smartness in <code class="languageplaintext highlighterrouge">rational.h</code>. Again, <code class="languageplaintext highlighterrouge">extern "C"</code> is only a C++ feature, so it cannot be part of the header when the C compiler is looking at it. The solution to this is to use the <code class="languageplaintext highlighterrouge">__cplusplus</code> macro once more.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.h</span>
<span class="cp">#ifdef __cplusplus
</span><span class="k">class</span> <span class="nc">Rational</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span>
<span class="cp">#endif // __cplusplus
</span>
<span class="cp">#ifdef __cplusplus
</span><span class="k">extern</span> <span class="s">"C"</span> <span class="p">{</span>
<span class="cp">#endif // __cplusplus
</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">make_rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">);</span>
<span class="kt">int</span> <span class="n">get_numer</span><span class="p">(</span><span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">r</span><span class="p">);</span>
<span class="kt">int</span> <span class="n">get_denom</span><span class="p">(</span><span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">r</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">del_rational</span><span class="p">(</span><span class="kt">void</span> <span class="o">**</span><span class="n">rp</span><span class="p">);</span>
<span class="cp">#ifdef __cplusplus
</span><span class="p">}</span> <span class="c1">// extern "C"</span>
<span class="cp">#endif // __cplusplus
</span>
</code></pre></div></div>
<p>This wraps all of the function declarations in an <code class="languageplaintext highlighterrouge">extern "C"</code> block when the C++ compiler is looking at the header. After making those changes to <code class="languageplaintext highlighterrouge">rational.h</code> and <code class="languageplaintext highlighterrouge">rational.cc</code>, we get the following output.</p>
<div class="languagesh highlighterrouge"><div class="highlight"><pre class="highlight"><code>g++ <span class="nt">-c</span> rational.cc
gcc main.c rational.o
./a.out
5 / 3
</code></pre></div></div>
<p>We successfully created a class in C++ that we can now use in C!</p>
<p>Now that we have covered how to use the preprocessor to change the content of a file based on the compiler that’s looking at it, we can make the API a bit safer, too. To do that, we create an opaque type that acts as a proxy for the <code class="languageplaintext highlighterrouge">Rational</code> class on the C side. Because this type is only declared, the C compiler will ensure that the pointers passed around in the interface are all of the same type (i.e. <code class="languageplaintext highlighterrouge">Rational</code>). However, it won’t let you dereference the pointers, because the type is never actually defined on the C side.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#ifdef __cplusplus
</span>
<span class="k">class</span> <span class="nc">Rational</span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="p">};</span>
<span class="cp">#else
</span>
<span class="c1">// Opaque type as a C proxy for the class.</span>
<span class="k">typedef</span> <span class="k">struct</span> <span class="nc">Rational</span> <span class="n">Rational</span><span class="p">;</span>
<span class="cp">#endif // __cplusplus
</span></code></pre></div></div>
<p>In addition to that, we now replace all <code class="languageplaintext highlighterrouge">void *</code> with <code class="languageplaintext highlighterrouge">Rational *</code>. This will allow you to remove some of the <code class="languageplaintext highlighterrouge">static_cast</code>s from the beginning.</p>
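<p>Put together, the typed interface might look roughly like the following single-file sketch. It assumes the same <code class="languageplaintext highlighterrouge">malloc</code>-based allocation as before; in the actual project the class and the declarations would live in <code class="languageplaintext highlighterrouge">rational.h</code> and the definitions in <code class="languageplaintext highlighterrouge">rational.cc</code>. The null check after <code class="languageplaintext highlighterrouge">malloc</code> is an addition of this sketch and is not part of the code shown earlier.</p>

```cpp
// Single-file sketch of the typed interface (C++ view of rational.h
// followed by rational.cc). Compile as C++.
#include <cstdlib>

class Rational {
public:
  int _numer;
  int _denom;
  Rational(int numer, int denom) : _numer{numer}, _denom{denom} {}
};

// The declarations C code would see, now using Rational * instead of void *.
extern "C" {
Rational *make_rational(int numer, int denom);
int get_numer(const Rational *r);
int get_denom(const Rational *r);
void del_rational(Rational **rp);
}

// The definitions inherit C linkage from the declarations above.
Rational *make_rational(int numer, int denom) {
  // malloc still returns void *, so this one cast has to stay.
  Rational *r = static_cast<Rational *>(std::malloc(sizeof(Rational)));
  if (r == nullptr) {
    return nullptr;  // Assumption of this sketch: report failure as null.
  }
  r->_numer = numer;
  r->_denom = denom;
  return r;
}

// No casts needed anymore: the parameter already has the right type.
int get_numer(const Rational *r) { return r->_numer; }
int get_denom(const Rational *r) { return r->_denom; }

void del_rational(Rational **rp) {
  std::free(*rp);  // Free the instance on the heap.
  *rp = nullptr;   // Reset the caller's pointer so it does not dangle.
}
```

<p>The casts in the accessors and in <code class="languageplaintext highlighterrouge">del_rational</code> disappear; only the one that converts the result of <code class="languageplaintext highlighterrouge">malloc</code> remains.</p>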
<h1 id="linkingthecstandardlibrary">Linking the C++ standard library</h1>
<p>Above, we used <code class="languageplaintext highlighterrouge">malloc</code> and a cast to allocate the instance of <code class="languageplaintext highlighterrouge">Rational</code> to prevent a linker error later on. If we had used <code class="languageplaintext highlighterrouge">new</code> and <code class="languageplaintext highlighterrouge">delete</code> instead (which is the proper C++ way), we would have gotten linker errors like this one: <code class="languageplaintext highlighterrouge">rational.cc:(.text+0x15): undefined reference to `operator new(unsigned long)'</code>. Usually in a C++ program, this issue doesn’t arise because <code class="languageplaintext highlighterrouge">new</code> and <code class="languageplaintext highlighterrouge">delete</code> are provided by the C++ standard library. The problem is that we used a C compiler to link the executable, and it doesn’t link the C++ standard library by default. The solution is to <strong>pass the linker flag <code class="languageplaintext highlighterrouge">-lstdc++</code> to the compiler explicitly</strong>, for example <code class="languageplaintext highlighterrouge">gcc main.c rational.o -lstdc++</code>.</p>
<p>With <code class="languageplaintext highlighterrouge">new</code> we can also use normal C++ constructors, making everything more concise and safe:</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.cc</span>
<span class="cp">#include "rational.h"
</span>
<span class="k">extern</span> <span class="s">"C"</span> <span class="n">Rational</span> <span class="o">*</span><span class="nf">make_rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Now we're using the constructor.</span>
<span class="n">Rational</span> <span class="o">*</span><span class="n">r</span> <span class="o">=</span> <span class="k">new</span> <span class="n">Rational</span><span class="p">(</span><span class="n">numer</span><span class="p">,</span> <span class="n">denom</span><span class="p">);</span>
<span class="k">return</span> <span class="n">r</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// ...</span>
<span class="k">extern</span> <span class="s">"C"</span> <span class="kt">void</span> <span class="nf">del_rational</span><span class="p">(</span><span class="n">Rational</span> <span class="o">**</span><span class="n">rp</span><span class="p">)</span> <span class="p">{</span>
<span class="k">delete</span> <span class="o">*</span><span class="n">rp</span><span class="p">;</span>
<span class="o">*</span><span class="n">rp</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
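<p>One consequence of resetting the caller’s pointer in <code class="languageplaintext highlighterrouge">del_rational</code> is that releasing the same handle twice is harmless, because <code class="languageplaintext highlighterrouge">delete</code> on a null pointer does nothing. The following self-contained sketch illustrates this; the helper <code class="languageplaintext highlighterrouge">release_twice_is_safe</code> is a hypothetical name used only for the demonstration.</p>

```cpp
// Sketch: the new/delete based wrappers plus a small demonstration.
class Rational {
public:
  int _numer;
  int _denom;
  Rational(int numer, int denom) : _numer{numer}, _denom{denom} {}
};

extern "C" Rational *make_rational(int numer, int denom) {
  return new Rational(numer, denom);
}

extern "C" void del_rational(Rational **rp) {
  delete *rp;      // No-op if *rp is already null.
  *rp = nullptr;   // Reset the caller's pointer.
}

// Hypothetical helper: releases the same handle twice and reports
// whether the handle ended up null.
bool release_twice_is_safe() {
  Rational *r = make_rational(5, 3);
  del_rational(&r);  // Frees the instance and sets r to nullptr.
  del_rational(&r);  // delete on a null pointer does nothing.
  return r == nullptr;
}
```

<p>Note that this only protects the one pointer variable that was passed in. Any copies of the pointer that the C code made elsewhere still dangle after the first call.</p>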
<h1 id="handlingexceptions">Handling exceptions</h1>
<p>Exceptions are another feature of C++ that C doesn’t have. If the C++ code we wrapped throws an exception that nobody catches, the whole program will crash without doing any cleanup. This can be addressed in multiple ways, one of which is to pass <code class="languageplaintext highlighterrouge">-fno-exceptions</code> to the C++ compiler, which rejects code that uses exceptions and aborts if a library throws one. The more realistic and safer approach is to carefully <strong>catch all exceptions at the language boundary</strong>.</p>
<p>If you take another look at the definition of rational numbers above, you’ll notice that we don’t actually ensure that $q \neq 0$. This will become problematic if we try to implement rational number arithmetic for our class. We’ll address this by throwing an exception in the constructor if the denominator is 0.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.h</span>
<span class="cp">#ifdef __cplusplus
</span>
<span class="cp">#include <stdexcept>
</span>
<span class="k">class</span> <span class="nc">Rational</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="kt">int</span> <span class="n">_numer</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">_denom</span><span class="p">;</span>
<span class="n">Rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">_numer</span> <span class="o">=</span> <span class="n">numer</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">denom</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">throw</span> <span class="n">std</span><span class="o">::</span><span class="n">domain_error</span><span class="p">(</span><span class="s">"denominator is 0"</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">_denom</span> <span class="o">=</span> <span class="n">denom</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">};</span>
<span class="cp">#endif // __cplusplus
</span>
<span class="c1">// ...</span>
</code></pre></div></div>
<p>Since we now know that the constructor might throw, we catch all exceptions in the wrapper and return <code class="languageplaintext highlighterrouge">nullptr</code> in case of an exception. In general, it’s often a good idea to catch anything and return a generic error value such as null. Beyond that, you could build arbitrarily elaborate error-handling schemes at the language boundary.</p>
<div class="languagec++ highlighterrouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// rational.cc</span>
<span class="cp">#include "rational.h"
</span>
<span class="k">extern</span> <span class="s">"C"</span> <span class="n">Rational</span> <span class="o">*</span><span class="nf">make_rational</span><span class="p">(</span><span class="kt">int</span> <span class="n">numer</span><span class="p">,</span> <span class="kt">int</span> <span class="n">denom</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="c1">// Allocate an instance on the heap.</span>
<span class="n">Rational</span> <span class="o">*</span><span class="n">r</span> <span class="o">=</span> <span class="k">new</span> <span class="n">Rational</span><span class="p">(</span><span class="n">numer</span><span class="p">,</span> <span class="n">denom</span><span class="p">);</span>
<span class="k">return</span> <span class="n">r</span><span class="p">;</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In such a simple case, it’s also feasible to check whether the denominator is 0 in <code class="languageplaintext highlighterrouge">make_rational</code> itself, but that doesn’t carry over to more realistic examples.</p>
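<p>A slightly richer scheme than returning null could translate each kind of exception into a status code that C code can inspect. The sketch below is only one possible design and is not taken from the code of this post: the function <code class="languageplaintext highlighterrouge">make_rational_checked</code> and the <code class="languageplaintext highlighterrouge">RATIONAL_*</code> status codes are hypothetical names introduced for illustration.</p>

```cpp
// Sketch: map exceptions to status codes at the language boundary.
#include <new>
#include <stdexcept>

class Rational {
public:
  int _numer;
  int _denom;
  Rational(int numer, int denom) : _numer{numer}, _denom{denom} {
    if (denom == 0) {
      throw std::domain_error("denominator is 0");
    }
  }
};

// Hypothetical status codes; plain enum so that C can use them too.
enum rational_status {
  RATIONAL_OK = 0,
  RATIONAL_ERR_ARG = 1,     // Invalid argument passed by the caller.
  RATIONAL_ERR_DOMAIN = 2,  // Constructor rejected the denominator.
  RATIONAL_ERR_NOMEM = 3,   // Allocation failed.
  RATIONAL_ERR_UNKNOWN = 4  // Any other exception.
};

// Hypothetical wrapper: writes the new instance to *out and returns a
// status code. No exception can leave this function.
extern "C" int make_rational_checked(int numer, int denom, Rational **out) {
  if (out == nullptr) {
    return RATIONAL_ERR_ARG;
  }
  *out = nullptr;
  try {
    // If the constructor throws, new releases the memory again.
    *out = new Rational(numer, denom);
    return RATIONAL_OK;
  } catch (const std::domain_error &) {
    return RATIONAL_ERR_DOMAIN;
  } catch (const std::bad_alloc &) {
    return RATIONAL_ERR_NOMEM;
  } catch (...) {
    return RATIONAL_ERR_UNKNOWN;
  }
}
```

<p>The final <code class="languageplaintext highlighterrouge">catch (...)</code> is what keeps the guarantee from the simpler version: whatever the wrapped code throws, the C caller only ever sees an integer.</p>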
<p>You can find all the code for this post <a href="https://github.com/d4ckard/blogcode/tree/main/20230811canyouuseaclassinc">on my GitHub</a>.</p>
<h1 id="conclusion">Conclusion</h1>
<p>I ended up not using libelfin for my debugger, but I am glad that I had this opportunity to learn so much about calling C++ code from C. This is the first time that I have documented any of the insights I discovered about a particular problem, and I am excited to find out what you think about it. Feel free to contact me through my <a href="https://thasso.xyz/about">about page</a>. Your insights and perspectives would be greatly appreciated. I am committed to writing more posts like this one in the future, and I hope you found it helpful ^^.</p>