
4 Lex

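lex.py is used to tokenize an input string. For example, suppose you are designing a programming language and a user supplies the following input string: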

x = 3 + 42 * (s - t)

A tokenizer splits the string into individual tokens:

"x", "=", "3", "+", "42", "*", "(", "s", "-", "t", ")"

Tokens are usually given names that indicate what kind of token they are:

"ID", "EQUALS", "NUMBER", "PLUS", "NUMBER", "TIMES", "LPAREN", "ID", "MINUS", "ID", "RPAREN"

Pairing each token name with the token value itself gives:

("ID", "x"), ("EQUALS", "="), ("NUMBER", "3"), ("PLUS", "+"), ("NUMBER", "42"), ("TIMES", "*"), ("LPAREN", "("), ("ID", "s"), ("MINUS", "-"), ("ID", "t"), ("RPAREN", ")")

Token rules are typically specified with regular expressions. The next section shows how this is done with lex.py.
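
To make the idea of regular-expression token rules concrete, here is a minimal sketch that tokenizes the example input using only Python's standard `re` module. It does not use lex.py and is not lex.py's API: the names `TOKEN_SPEC`, `MASTER_RE`, and `tokenize`, as well as the exact patterns, are assumptions chosen for illustration.

```python
import re

# Hypothetical token specification: (token name, regular expression) pairs.
# The names follow the example above; the patterns are assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_][A-Za-z0-9_]*"),
    ("EQUALS", r"="),
    ("PLUS",   r"\+"),
    ("MINUS",  r"-"),
    ("TIMES",  r"\*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"[ \t]+"),  # whitespace is matched but not reported
]

# Combine all rules into one pattern made of named groups.
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(text):
    """Return a list of (token name, token value) pairs for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER_RE.match(text, pos)
        if match is None:
            raise ValueError(f"Illegal character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the rule that matched.
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for name, value in tokenize("x = 3 + 42 * (s - t)"):
    print(name, value)
```

Each token value is kept as a string here, matching the pairs listed above. Converting `NUMBER` values to integers would be a natural next step, but it is left out to keep the sketch short.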