Semantics & Static Type Analysis of Python Programs


Python is an increasingly popular programming language, used a lot in the scientific community, and well known for its powerful and permissive high-level syntax. Python is an object-oriented, interpreted, dynamic programming language, meaning for example that variables are not typechecked before execution and metaprogramming features such as introspection are widely used. In particular, errors such as undeclared variables and type incompatibilities are detected at runtime by the interpreter and raised as exceptions (which can be caught by the program itself).

The goal of this PhD is to develop semantical static analyses for Python, i.e. to detect automatically uncaught exceptions without running the programs. We have developed an analysis inferring the types of each value which is necessary to detect these exceptions. To perform such an analysis, and to prove properties on this analysis, we need to know the semantics of Python programs, which will be the focus of this talk. Python efficient and concise syntax entails the semantics to be as accommodating as possible rather than raise exceptions, creating many behaviors for operators as simple as the addition. We will however see that interactions between those accomodating behaviors may result in some unexpected results, before defining the input-output semantics of some Python operators. Then, we will present how we check that our semantics is correct using our static analysis, and compare our approach with other methods. We will finish by discussing the issues of having one reference interpreter defining the behavior of Python programs, compared to having a standard defining the language.