Symbol Table
Wikipedia
In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, in which each identifier (or symbol), constant, procedure and function in a program’s source code is associated with information relating to its declaration or appearance in the source. In other words, each entry of a symbol table stores the information related to its corresponding symbol.
A compiler may use one large symbol table for all symbols or use separated, hierarchical symbol tables for different scopes.
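The hierarchical arrangement can be sketched as a chain of per-scope tables, each pointing at its enclosing scope. The following is a minimal illustration rather than how any particular compiler does it: the names `Scope`, `Symbol`, `scope_init`, `scope_declare`, `scope_lookup` and the fixed capacity `MAX_SYMBOLS` are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 16  /* illustrative fixed capacity per scope */

/* One declared name plus a textual description of its type. */
typedef struct {
    const char *name;
    const char *type;
} Symbol;

/* One table per scope; parent is the enclosing scope (NULL for global). */
typedef struct Scope {
    struct Scope *parent;
    Symbol symbols[MAX_SYMBOLS];
    size_t count;
} Scope;

/* Initialise an empty scope nested inside parent. */
void scope_init(Scope *scope, Scope *parent) {
    scope->parent = parent;
    scope->count = 0;
}

/* Declare a name in this scope only. Returns 0 on success, -1 if the
   name is already declared in this scope or the scope is full. */
int scope_declare(Scope *scope, const char *name, const char *type) {
    for (size_t i = 0; i < scope->count; i++)
        if (strcmp(scope->symbols[i].name, name) == 0)
            return -1;
    if (scope->count == MAX_SYMBOLS)
        return -1;
    scope->symbols[scope->count].name = name;
    scope->symbols[scope->count].type = type;
    scope->count++;
    return 0;
}

/* Look a name up in the innermost scope first, then in each enclosing one. */
const Symbol *scope_lookup(const Scope *scope, const char *name) {
    for (; scope != NULL; scope = scope->parent)
        for (size_t i = 0; i < scope->count; i++)
            if (strcmp(scope->symbols[i].name, name) == 0)
                return &scope->symbols[i];
    return NULL;
}
```

A lookup that starts in a function body's scope falls back to the global scope when the name is not local, while a lookup that starts in the global scope never sees local names. The linear search inside each scope is only there to keep the sketch short.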
A common data structure used to implement symbol tables is the hash table.
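As a rough sketch of the hash-table approach, the code below stores entries in a fixed number of buckets with separate chaining. The type and function names (`SymbolTable`, `Entry`, `symtab_insert`, `symtab_lookup`, `symtab_free`), the bucket count, and the choice of the djb2 string hash are assumptions for illustration; production compilers typically use growable tables and richer entries.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKET_COUNT 64  /* illustrative; real tables usually grow */

typedef struct Entry {
    char *name;          /* owned copy of the identifier */
    const char *type;    /* not owned; the caller keeps it alive */
    struct Entry *next;  /* next entry in the same bucket */
} Entry;

typedef struct {
    Entry *buckets[BUCKET_COUNT];
} SymbolTable;

/* djb2 string hash: h = h * 33 + c, starting from 5381. */
static unsigned long hash_name(const char *s) {
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Return the entry for name, or NULL if it is not in the table. */
Entry *symtab_lookup(SymbolTable *table, const char *name) {
    Entry *e = table->buckets[hash_name(name) % BUCKET_COUNT];
    for (; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Insert name, or update its type if it already exists.
   Returns the entry, or NULL if memory allocation fails. */
Entry *symtab_insert(SymbolTable *table, const char *name, const char *type) {
    Entry *e = symtab_lookup(table, name);
    if (e != NULL) {
        e->type = type;
        return e;
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    size_t len = strlen(name) + 1;
    e->name = malloc(len);
    if (e->name == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->name, name, len);
    e->type = type;
    unsigned long i = hash_name(name) % BUCKET_COUNT;
    e->next = table->buckets[i];  /* push onto the front of the chain */
    table->buckets[i] = e;
    return e;
}

/* Release every entry and leave the table empty. */
void symtab_free(SymbolTable *table) {
    for (size_t i = 0; i < BUCKET_COUNT; i++) {
        Entry *e = table->buckets[i];
        while (e != NULL) {
            Entry *next = e->next;
            free(e->name);
            free(e);
            e = next;
        }
        table->buckets[i] = NULL;
    }
}
```

A table must start zero-initialised, for example `SymbolTable table = {0};`. With a reasonable hash and a load factor kept low, each lookup touches only one short chain, which is why this structure suits a table that is consulted so often.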
As the lexical analyser spends a great proportion of its time looking up the symbol table, this activity has a crucial effect on the overall speed of the compiler. A symbol table must be organised in such a way that entries can be found as quickly as possible.
Applications
An object file contains a symbol table of the externally visible identifiers it contains. During the linking of different object files, a linker identifies and resolves these symbol references. Usually all undefined external symbols are searched for in one or more object libraries. If a module is found that defines a symbol, it is linked together with the first object file, and any undefined external identifiers of that module are added to the list of identifiers to be looked up. This process continues until all external references have been resolved. It is an error if one or more references remain unresolved at the end of the process.
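The resolution loop described above can be modelled with a small worklist. This is a simplified sketch under stated assumptions, not the algorithm of any real linker: the `Module` structure, the `link_resolve` function, and the fixed limits `MAX_NAMES` and `MAX_PENDING` are invented for illustration, and real linkers also deal with archive ordering, weak symbols, and duplicate definitions.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 8      /* illustrative limits for the sketch */
#define MAX_PENDING 64

/* A hypothetical object module: the symbols it defines and the ones it needs. */
typedef struct {
    const char *name;
    const char *defines[MAX_NAMES];
    size_t n_defines;
    const char *needs[MAX_NAMES];
    size_t n_needs;
    int linked;          /* set to 1 once the module is pulled into the link */
} Module;

static int module_defines(const Module *m, const char *symbol) {
    for (size_t i = 0; i < m->n_defines; i++)
        if (strcmp(m->defines[i], symbol) == 0)
            return 1;
    return 0;
}

/* Resolve the external references of first against the library modules.
   Returns the number of symbols left unresolved, or -1 if the pending
   list overflows. */
int link_resolve(Module *first, Module *library, size_t n_library) {
    const char *pending[MAX_PENDING];
    size_t n_pending = 0;
    int unresolved = 0;

    first->linked = 1;
    for (size_t i = 0; i < first->n_needs; i++)
        pending[n_pending++] = first->needs[i];

    for (size_t next = 0; next < n_pending; next++) {
        const char *symbol = pending[next];

        /* Skip names that were already handled earlier in the list. */
        int seen = 0;
        for (size_t i = 0; i < next && !seen; i++)
            seen = strcmp(pending[i], symbol) == 0;
        if (seen)
            continue;

        /* Prefer a definition from a module that is already linked. */
        Module *candidate = NULL;
        int resolved = module_defines(first, symbol);
        for (size_t i = 0; i < n_library && !resolved; i++) {
            if (!module_defines(&library[i], symbol))
                continue;
            if (library[i].linked)
                resolved = 1;
            else if (candidate == NULL)
                candidate = &library[i];
        }
        if (resolved)
            continue;
        if (candidate == NULL) {
            unresolved++;        /* no module defines this symbol */
            continue;
        }

        /* Link the defining module and queue its own undefined symbols. */
        candidate->linked = 1;
        for (size_t i = 0; i < candidate->n_needs; i++) {
            if (n_pending == MAX_PENDING)
                return -1;
            pending[n_pending++] = candidate->needs[i];
        }
    }
    return unresolved;
}
```

A nonzero return value corresponds to the error case in which references remain unresolved. Library modules that define nothing the link needs are never marked as linked.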
While reverse engineering an executable, many tools refer to the symbol table to check what addresses have been assigned to global variables and known functions. If the symbol table has been stripped or cleaned out before being converted into an executable, tools will find it harder to determine addresses or understand anything about the program.
Example
    // Declare an external function
    extern double bar(double x);

    // Define a public function
    double foo(int count)
    {
        double sum = 0.0;

        // Sum all the values bar(1) to bar(count)
        for (int i = 1; i <= count; i++)
            sum += bar((double) i);
        return sum;
    }
A C compiler that parses this code will contain at least the following symbol table entries:
Symbol name | Type | Scope |
---|---|---|
bar | function, double | extern |
x | double | function parameter |
foo | function, double | global |
count | int | function parameter |
sum | double | block local |
i | int | for-loop statement |
In addition, the symbol table will also contain entries generated by the compiler for intermediate expression values (e.g., the expression that casts the i loop variable into a double, and the return value of the call to function bar()), statement labels, and so forth.