One of the first things you learn to do with Ruby is to evaluate a mathematical expression in `irb`

. But what’s the magic behind this seemingly trivial operation? Let’s create our own evaluation engine and find out!

Note: You could cheat and just use

`eval`

, but that would defeat the point of this exercise.

[Tweet “”Let’s create our own evaluation engine in Ruby” via @matugm”]

## Tokenizing Our Input

We are given a string as input, but what we need are numbers and operators. We can use the StringScanner class and ‘tag’ the different parts of the input so we can recognize them.

This is the main method of the `Tokenizer`

class:

```
def parse(input)
@buffer = StringScanner.new(input)
until @buffer.eos?
skip_spaces
read_tokens
end
@tokens
end
```

In this method, I made a `StringScanner`

object with the input string as an argument, then I have a loop which keeps going until the end of the string is reached. Inside this loop, I call two methods: `skip_spaces`

and `read_tokens`

.

The `skip_spaces`

method is very simple:

```
def skip_spaces
@buffer.skip(/\s+/)
end
```

The `/\s+/`

part is a regular expression that matches spaces (including tabs and newlines). Then we have `read_tokens`

, which is where most of the work happens:

```
RULES.each do |regex, type|
token = @buffer.scan(regex)
if token
@tokens << Token.new(type, token)
end
end
```

In this method, we try to match part of the input string against one of our parsing rules, which are just regular expressions. When a match is found, we create a new token object with the type and value. For example, for the number ‘5’ we would have `Token.new(:int, 5)`

.

```
RULES = {
/\d+/ => :int,
/[\/+\-*]/ => :op
}
```

And that’s all we need to convert our input string into tokens. Now let’s see how we can turn those tokens into something more useful.

hbspt.cta.load(1169977, ‘964db6a6-69da-4366-afea-b129019aff07’, {});

## The Shunting Yard Algorithm

Now that we have our tokens, we need to convert them into a format we can work with. The normal format for a math expression is called ‘infix’. The problem with this format is that it’s hard to evaluate because of operation precedence. There is another way to represent a mathematical expression: the ‘postfix’ notation.

**Examples:**

Infix ___________ | Postfix |
---|---|

3 + 4 | 3 4 + |

9 * 4 | 9 4 * |

2 * 5 + 1 | 2 5 * 1 + |

We are going to use the Shunting-yard algorithm to convert from the `infix`

notation to `postfix`

. The way this algorithm works is by using two stacks, one for **numbers** and another for **operators**.

Follow these steps:

- Read a token.
- If it’s a number, put it in the numbers stack.
- If it’s an operator, put it in the operators stack, but first, check the precedence of the operator on the top of the stack. Move it into the numbers stack if the precedence is higher or equal.
- When you’ve read all the tokens, pop all the operators and put them into the numbers stack.
- Done!

Let’s take a look at the code, starting with the `initialize`

method:

```
def initialize(tokens)
@tokens = tokens
@output = []
@operators = []
end
```

In this method, we do our initial setup by using the tokens from the previous phase (tokenization) and creating two arrays that will serve as our stacks. Let’s continue our exploration of the code:

```
@tokens.each do |token|
push_number(token) if token.type == :int
process_operator(token) if token.type == :op
end
```

Inside the `run`

method, we iterate over our tokens array and put them in the correct stacks.

```
def operator_priority_is_less_or_equal_than(token)
@operators.last && (token.priority <= @operators.last.priority)
end
```

This method helps us handle operator precedence. The `priority`

is a method defined in the `Token`

class.

```
def priority
return 1 if value == '+' || value == '-'
return 2 if value == '*' || value == '/'
end
```

When we’re done, we just dump the remaining operators into our output stack.

`@output += @operators.reverse`

We need to call `reverse`

here. Putting things into a stack and taking them out reverses them, but in this case, we want the original order.

And that’s all it takes to convert to postfix notation (also know as Reverse Polish Notation or RPN).

## Evaluation

Now that we have our expression in postfix notation, it’s very easy to evaluate using a single stack.

**Algorithm**:

- We put all the numbers we find on the numbers stack.
- Whenever we find an operator, we pop two numbers from the stack.
- Then we calculate the result and put the result back on the top of the stack.
- Repeat until done. Last number will be the final result.

```
tokens.each do |token|
@numbers << token if token.type == :int
if token.type == :op
right_num = @numbers.pop
left_num = @numbers.pop
raise 'Invalid postfix expression' unless right_num && left_num
result = evaluate(right_num.value, left_num.value, token)
@numbers << Token.new(:int, result)
end
end
```

This is the main loop for the evaluation method. Once we find an operation, we call the `evaluate`

method and create a new token object with the results.

```
def evaluate(right_num, left_num, operation)
case operation.value
when '+' then left_num + right_num
when '-' then left_num - right_num
when '*' then left_num * right_num
when '/' then left_num / right_num
end
end
```

There is nothing special about this method; we just apply the operation and return the result.

`@numbers.last.value`

Once all the operators have been processed, we’ll have one token left on our numbers stack. This token is the final result.

## Conclusion

In this post, you learned about tokenization, which lets you break a string into its component values. This is what most programming languages (like Ruby or JavaScript) do behind the scenes to your source code so they can turn it into computer instructions.

Then you learned about two different ways to arrange a mathematical expression, ‘infix’ and ‘postfix’, and how to convert between them by using the ‘Shunting-Yard algorithm’. You’ve also learned about the ‘stack’ data structure and what it can be useful for.

I hope you enjoyed this thought exercise! You can find the final code here: https://github.com/matugm/math-eval

[Tweet “”How to Build a Math Evaluation Engine” via @matugm”]

Alex Popov says

Thank you very much for the super neatly written article. It is well worth the read!

aimen sasi says

all of this is fine, but this is a best case scenario. What if the input included wrong number of operators, for example (A + B * (4 * / 5)). is there a way to validate that the expression is right.