Programming languages are categorised as either statically typed or dynamically typed, and the differences may actually surprise you, as they go deeper than they may seem at first.
Contents:
- Dynamic Type Variables: The Obvious, but Superficial Difference
- Dynamic Types: A Deeper Difference
- Strange Behaviours
- Behaviour Summary: Total flexibility, total uncertainty
- Static Types
- Compile Time vs Run Time
- Kotlin Example
- Restrictions: An Advantage?
- Beating Restrictions: Dynamic to Static
- Converting Code
- Same Function – Different Parameter Types
- Variables that change type
- Dictionaries/Maps as Objects
- Dynamic Classes
- Dynamic Objects
- Dynamic Types: Benefits
- Practical Usage?
- Shorthand?
- Static Types: Benefits?
- Restrictions = certainty
- Tooling
- Performance
- Code Readability
- Static Types In Python
- The efficiency impact
- Most Programs: No Impact
- Special Usage: A Killer
- Coding Impact
- Telling The Compiler
- Duck Typing Anyone?
- Conclusion
Dynamic Type Variables: The Obvious, but more Superficial Difference
When I asked a few people about static vs dynamic typed languages, the common answer was that in dynamic languages, variables do not need an associated specific type. With a dynamically typed language such as Python, a variable ‘foo’ has no type of its own, and could first be an int and then later a string, so the type of ‘foo’ could be considered dynamic.
The implications of Type Free Variables are less significant than the implications of Dynamic Types, but the effects still go beyond that which is immediately obvious.
Consider the following sample code:
foo = 4            # the type is determined by the value
foo = "hello"      # foo has no type itself, so it can now hold a string

foo2: int = 4      # Python itself does not check the type annotation
foo2 = "hello"     # so foo2 can still reference a string


def isOdd(b_is_int, data):
    if b_is_int:
        print("data is {}".format("odd" if data % 2 else "even"))
    else:
        print("length of data is {}".format("odd" if len(data) % 2 else "even"))
    if b_is_int is None:
        data.non_existant_function()  # only fails if this line is actually reached


def isOdd2(a):
    if isinstance(a, int):
        print("a is {}".format("odd" if a % 2 else "even"))
    else:
        print("length of a is {}".format("odd" if len(a) % 2 else "even"))
With Python, foo can be set to any type at any time. In Python (currently at least), we can even declare that foo2 will be an int, and still set foo2 to a string. Declaring the type is for external type checking only: the checking happens in tools outside the Python interpreter, and has no impact on how Python itself runs the code.
Changing the type referenced by a variable such as foo is of little consequence when the code is all in the one source file. A more interesting case is a function that lives in another file and is imported by the calling code. With the call and the function in separate files, the calling code can more easily mistake what is expected by the function.
The isOdd() function in the Python example allows int operations (e.g. % for modulus) or string operations (the len() function) on the parameter data, and does not require checking the type before trying those operations. Clearly isOdd() would generate an exception for cases such as isOdd(True, "abc"), which would fail with a ‘not all arguments converted during string formatting’ exception, or isOdd(False, 4), which would fail with an ‘object of type 'int' has no len()’ exception. These exceptions happen only at run time: given the language allows any values for the two parameters, it is not possible to know what will be allowed until attempting the code with actual values.
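To make the two failure cases concrete, here is a minimal sketch. It assumes a variant of isOdd() that returns its text instead of printing it; the names is_odd and try_call are hypothetical helpers introduced only for this illustration.

```python
def is_odd(b_is_int, data):
    """Same logic as isOdd() above, but returning the text instead of printing."""
    if b_is_int:
        return "data is {}".format("odd" if data % 2 else "even")
    return "length of data is {}".format("odd" if len(data) % 2 else "even")


def try_call(b_is_int, data):
    """Call is_odd() and report a TypeError as text rather than crashing."""
    try:
        return is_odd(b_is_int, data)
    except TypeError as err:
        return "TypeError: {}".format(err)


print(try_call(True, 3))        # matching arguments: works
print(try_call(False, "abcd"))  # matching arguments: works
print(try_call(True, "abc"))    # "abc" % 2 is string formatting, so it fails
print(try_call(False, 4))       # len(4) fails: an int has no length
```

Nothing in the function signature warns about the last two calls. The mismatch only shows up when those particular lines run with those particular values.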
For comparison, the equivalent in Kotlin:

fun main() {
    var foo = 3          // foo takes its type from the initial value, so Int
    // foo = "abc"       // not permitted: foo must always be an Int

    var foo2: Any = 3    // force the type to Any
    foo2 = "abc"         // permitted, but foo2 must be cast before it is useful

    isOdd(foo)
    isOdd(foo2)
}

fun isOdd(a: Any) {
    when (a) {
        is Int -> println("a is ${if (a % 2 != 0) "odd" else "even"}")
        is String -> println("size of a is ${if (a.length % 2 == 1) "odd" else "even"}")
    }
}
Dynamic Types: A Deeper Difference
Introduction: Strange Behaviours.
Digging deeper, a more complex difference is that types themselves are dynamic. This means that a type does not have a fixed set of attributes. The type itself can change. So at one point in the code a type has one set of attributes, but at another point in the code, the attributes of the same type could have changed. Two objects of the same type need not have the same attributes! Or, for example, the required parameters of the same method could differ between two objects of the same type.
# note: output lines begin with ... to avoid formatting problems
class Sample:
    def __init__(self, a):
        self.a = a

    def test(self):
        print(self.a)

s1 = Sample(1)
s2 = Sample(2)
s1.test == s2.test  # are the two test functions the same?
... False
# verify both s1 and s2 work as expected
s1.test()
... 1
s2.test()
... 2

def new_test(self):
    print("new", self.a)

Sample.new_test = new_test  # add a new_test() to the class
s1.new_test()  # members now have a new function that was not there before
... new 1
s2.new_test()
... new 2

# next, instance attributes
s1.test = s1.test  # copy class attribute to instance attribute
del Sample.test    # remove test() from class definition
s1.test()  # s1 is not changed
... 1
s2.test()  # but s2 is, so now s1 and s2 have different attributes!
... AttributeError: 'Sample' object has no attribute 'test'

# define a method as a closure
def closure_wrapper(obj):
    def own_test():
        print(obj.a + 1)
    return own_test

# and create instance attribute
s2.own_test = closure_wrapper(s2)
s2.own_test()
... 3
This code creates a class “Sample”, and then two instances of “Sample”: “s1” and “s2”. At creation, s1 and s2 have the same attributes. Both have an ‘a’ attribute, although as an instance attribute each object holds its own value, so ‘a’ is 1 and 2 for s1 and s2 respectively.
With method attributes, the normal expectation is that not only will the same method attributes be present for all instances of the class, but those methods will actually be the same methods.
The ‘test’ methods for s1 and s2 do not pass an equals test, and neither is equal to Sample.test, as the values of s1.test and s2.test are bound methods, each wrapping the same function together with its own object. The code does run the same for each object, as expected.
The first ‘dynamic’ aspect of the sample is revealed by the assignment to Sample.new_test, which adds a new method to the class dynamically. The new_test method is now available to all members of the class, so dynamically at run time, the class has one set of attributes at one time, and another set at a later time. A function being passed a parameter of type Sample cannot be certain whether the method new_test is present or not.
Next come instance attributes. The fact that each instance of a class can have different values for an attribute is universal to object oriented languages. But with dynamic languages, different instances of the same class can have different attributes: not just different values for attributes, but different attributes.
The statement ‘s1.test = s1.test’ looks as if it sets something to what it already is, but in Python every attribute can come from multiple sources: from the object itself, from the class, or from the base class of that class, and so on until there are no more base classes. So this statement picks up the value from the class, and creates a new value in the instance. The result is that s1 gets its own reference to the function test, so when test is deleted from the class, s1 still has ‘test’ even though other Samples do not.
The last example is to create a custom method, and then add this method to s2, again making one instance of a class different from others of that class.
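The effect of ‘s1.test = s1.test’ can be checked directly by looking at the instance dictionary before and after the assignment. This is a small sketch using the same Sample class, with test() changed to return its value so the result can be compared; the variable names are only for illustration.

```python
class Sample:
    def __init__(self, a):
        self.a = a

    def test(self):
        return self.a


s1 = Sample(1)
s2 = Sample(2)

in_instance_before = "test" in vars(s1)  # 'test' is found only on the class so far
s1.test = s1.test                        # read via the class, store on the instance
in_instance_after = "test" in vars(s1)   # s1 now holds its own bound method

del Sample.test                          # remove the method from the class

s1_still_works = s1.test() == 1          # s1 kept its own copy
s2_has_test = hasattr(s2, "test")        # s2 relied on the class, so it lost test()

print(in_instance_before, in_instance_after, s1_still_works, s2_has_test)
# prints: False True True False
```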
Behaviour Summary
In a dynamic language, an object being of a certain class does not guarantee what attributes that object will have at run time, nor restrict what attributes the object will have at run time. So the type actually says nothing with any certainty at all. No certainty comes from the type, but also, no limitation follows from the type. Types are potentially completely fluid.
Static Types
Compile Time vs Run Time
With statically typed languages, the types themselves are defined at compile time. What is a run time ‘type’ object in the Python program is, for a typical statically typed program, a table inside the compiler. Additional code can even add extension functions to that table in the compiler, which at least gives the impression of extending a type, but again, this extension is made to the table used by the compiler. The table is not even present any more at run time, so run time change of the ‘class’ is impossible. The program is read by the compiler, object code is produced, and only after compilation is complete can the program be executed. So before the first line of code can be run, all definitions of class/type inside the compiler have been finalised. Within the program, there can be no variance of any class/type, as all definitions are final before the program can run.
Kotlin Example
class Sample(var a: Int) {
    fun test() = println(a)
}

fun main() {
    val s1 = Sample(1)
    val s2 = Sample(2)
    s2.test()
    s1.new_test()  // every instance has all the functions for the class
}

fun Sample.new_test() = println(a + 2)
OK, in the above Kotlin example the class definition is genuinely concise, but the main reason the code snippet is short is that most of what is in the Python code is simply not possible.
This is because all the declarations such as ‘fun’ or ‘class’ are compile time declarations, describing the system to the compiler. The names of the function, the class, the ‘a’ property or the ‘test’ method are not even needed at run time. An object of type Sample will need space for 2 attributes: the first will be the property ‘a’ and the second the function ‘test’. These will always be there, so they can always be in the same position, and there is no need at run time to look up whether they are there or where in the object they will be. The compiler uses the declarations to produce the program code, which is then available to be run. So all of the declarations have been applied prior to the first line of the program being run, and modifying those declarations is no longer possible.
Every instance of a type is guaranteed to have the set of attributes described in the type, and that will not vary throughout the entire program. Even extension functions or properties, which do not actually change the underlying class, cannot be added during program execution. Either they are present throughout the entire program, or not at all.
Restrictions: An Advantage?
At this point, the main advantage of static types is having more restrictions. Restrictions being an advantage may seem counter intuitive, but consider the quote from Robert ‘Uncle Bob’ Martin (the video is worth a full watch, but this link is to the quote):
If we have made any advances in software, since 1945, it is almost entirely in what not to do. Robert C Martin.
It is also useful to understand that ‘dynamic types’ are not a new evolution over staid, traditional static types. In fact dynamic types date back to Lisp in 1958 (the second oldest language, behind Fortran), making them almost as old as static types. Languages commonly implemented by interpreters have traditionally been dynamically typed, and the languages evolved from C tend to be compiled and have static types.
In the tradition of the advances discussed by ‘Uncle Bob’, restricting to static types is considered an improvement by advocates, even though you could just not make use of dynamic types in a dynamic language.
The restrictions imposed by static types are generally considered best practice. That is, a programmer should impose these restrictions on themselves, and in fact, programmers do generally impose these restrictions on themselves. There are, however, occasions where breaking the rules is as appealing as breaking a diet you just know you should follow. A variable should not change type without a really good reason. Working with truly dynamic classes is generally a complete pain and very hard to follow.
In fact, Python tools such as PyCharm warn developers who make their types dynamic by creating instance attributes in any method other than __init__. But tools can only help within limits, due to the very nature of a dynamic language, as the founder of Gradle, Hans Dockter, found. Hans, who as founder of Gradle generally cannot find time to code anymore, took time to personally migrate the majority of the 70,000 lines of code from Groovy to Kotlin required to move the project from a dynamically typed language to a statically typed one.
Beating Restrictions: Converting Dynamic to Static
Converting Code. The occasion arises when code needs to be converted from a dynamic language to a static language. Some projects start with dynamic languages and move to static, but rarely the reverse, because the larger the project, the greater the advantages of static languages. As stated in ‘Restrictions: An Advantage?’, almost no code really makes use of dynamic types or variables that change type. Almost no code. There is still some that doesn’t abide by the restrictions. When converting Python to Kotlin, or Groovy to Kotlin, a strategy for coding these exceptions is required. Remember: all popular languages are ‘Turing complete’, so they can all solve the same problems. The challenge is to do so working with the language and not against it.
Same Function, different parameter types. It is common to require alternate signatures for the same method or function. Statically typed languages resolve this problem with function overloading, while dynamically typed languages do not have the type information for overloading, so programs in these languages need to test parameters for type and then use conditional code. While moving from one to the other requires a re-think, neither direction is difficult, and overloaded functions mean less actual code is required in the statically typed version.
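As a sketch of the dynamic side of this conversion, the first function below tests the parameter type by hand. The second version uses functools.singledispatch from the Python standard library, which is about as close as Python gets to overloading, though the choice is still made at run time. The function names are hypothetical, chosen only for this illustration.

```python
from functools import singledispatch


def describe_manual(value):
    """Test the parameter type by hand, then branch."""
    if isinstance(value, int):
        return "int: {}".format("odd" if value % 2 else "even")
    if isinstance(value, str):
        return "str of length {}".format(len(value))
    raise TypeError("unsupported type: {}".format(type(value).__name__))


@singledispatch
def describe(value):
    """Fallback, used when no registered type matches."""
    raise TypeError("unsupported type: {}".format(type(value).__name__))


@describe.register(int)
def _describe_int(value):
    return "int: {}".format("odd" if value % 2 else "even")


@describe.register(str)
def _describe_str(value):
    return "str of length {}".format(len(value))


print(describe_manual(3), "|", describe(3))
print(describe_manual("abcd"), "|", describe("abcd"))
```

Each registered function maps naturally onto one overload in a statically typed language, so code already written in this style is the easiest to convert.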
Variables that change type: can be covered by union types. Variables should almost always be invariant in type, and in fact most ‘variables’ should actually be invariant in value, as per functional programming. But for the rare cases where a variable may need to hold values of more than one type, there are solutions, with variables able to hold different types even in statically typed code. The most powerful solutions are called unions in C, sealed classes in Kotlin and union types in TypeScript. A search will reveal workarounds for older versions of Java, which had no specific language feature to make it easy (sealed classes arrived in Java 17). However, there is always a way.
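The union idea can be sketched with Python type annotations. Note the assumption: Python itself does not enforce these annotations, so the union only has teeth when an external type checker is run over the code. The alias IntOrStr and the function parity are hypothetical names for this illustration.

```python
from typing import Union

# A value that is allowed to be either an int or a str, and nothing else.
IntOrStr = Union[int, str]


def parity(value: IntOrStr) -> str:
    """Report odd/even for an int, or for the length of a str."""
    if isinstance(value, int):
        # Inside this branch a type checker treats value as an int.
        return "odd" if value % 2 else "even"
    # Only str remains in the union here.
    return "odd" if len(value) % 2 else "even"


item: IntOrStr = 4
first = parity(item)
item = "hello"  # changing type is allowed, but only within the declared union
second = parity(item)
print(first, second)
```

The variable still changes type, but the set of types it may hold is written down, which is the part a statically typed language insists on.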
Dictionaries/Maps as Objects: Groovy and JavaScript allow object notation to be used with maps, which can give the impression that data which is implemented as a map is in fact an object. Of course, the use of object notation suggests a map that is in fact a ‘faux object’: data which should be an object, stored in a map. Generally, if the object is a singleton, check for a dynamically typed object; otherwise it will normally be possible to convert to a real object.
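A minimal sketch of converting a ‘faux object’ into a real one, shown in Python for brevity. The Person class and its fields are invented for this example and do not come from any real code base.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


# Data that has been living in a map, although it is really an object.
raw = {"name": "Ada", "age": 36}

# The fixed set of fields is now stated once, in the class definition.
person = Person(**raw)
print(person.name, person.age)

# A misspelt or unexpected key fails when the object is built,
# rather than silently becoming a new entry in the map.
try:
    Person(**{"name": "Ada", "age": 36, "agee": 1})
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```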
Dynamic Classes: Why have the rules for a class, where the class itself changes during execution of a program? Generally, there is no reason. The main use of dynamic classes is where the class is built by different pieces of code, a process which is complete before objects are instantiated. To convert such code, the main requirement is to collect all the code that is used to define the class, and have all this code moved to the class definition.
Dynamically Typed Objects: A dynamically typed object is an object that has properties added to the object that are not part of the class. Making modifications to an object in this way effectively makes the object a singleton, because adding custom properties to an individual object makes the object different from all others which would nominally be the same type. So modifying an individual object creates a singleton of a nameless class, which effectively inherits from the original class, but may now override original class members and/or have additional class attributes.
To implement such dynamic singleton objects in a static language, the best approach is to have the attributes stored in a collection. Effectively, use a map to store the ‘class’ attributes.
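The shape of that approach is sketched below in Python so it can be compared with the earlier Sample example; in a static language the same structure would be a fixed class holding one map field. The extras field and the call_extra method are hypothetical names for this illustration.

```python
class Sample:
    """Fixed class: every instance has exactly a, extras, test and call_extra."""

    def __init__(self, a):
        self.a = a
        self.extras = {}  # per-object 'dynamic' attributes live in this map

    def test(self):
        return self.a

    def call_extra(self, name, *args):
        """Look up a per-object function by name and call it with this object."""
        return self.extras[name](self, *args)


s1 = Sample(1)
s2 = Sample(2)

# Only s2 gets the custom behaviour, yet the class itself never changes.
s2.extras["own_test"] = lambda obj: obj.a + 1

print(s2.call_extra("own_test"))  # prints: 3
print("own_test" in s1.extras)    # prints: False
```

The difference between s1 and s2 is now ordinary data held in a map, so the type of both objects stays fixed.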
Dynamic Types: Benefits
Practical Uses
The reality is dynamic types are always present in a program, but very rarely used. How often does a variable actually need to be able to alter the type of data it is working with?
Most often, the advantage of dynamic types is that the programmer skips needing to say what type the data is. “a=3” is shorter than “a:int=3”. A function definition that does not bother to specify types is inherently shorter than a function definition that does specify types. The biggest use of dynamic types is that by avoiding the need to say what type, the code is that little bit shorter. Less typing is great, but if the program is to be read, even by the original author, months after the code was first written, then that saving in typing can create the need for more documentation. In some cases the type is obvious, and in fact in many of these cases modern languages can infer the type.
The main payback for dynamic types is to save typing, and save thinking ahead. The main disadvantage is that anyone reading the code at a later time has more work to do in being certain what the code does. This gives dynamic types an advantage for the simple program that will be written today and then forgotten. Less typing, with no real downside. But if that program will be read by different programmers in order to work with the code in future, it takes just a few occasions of working out what the code does for the advantage to swing back to static types.
Shorthand
Can the programmer remember exactly what each variable is used for without reading the code? In a short, recently written program, the answer is most often “yes”, and in that case saving even just a few extra letters gives a shorter program with no negatives. Consider “a=3” in Python. In Kotlin that becomes “val a=3”. The slightly longer code gives more information to the reader: “a” is immutable, and is being declared here. In Python “a” could be being reassigned, or declared; the code does not say. Will “a” always be an int? In a short program, at the time of writing, the programmer knows all of these things and writing them down is an overhead. The programmer writing the code has a win. For a second programmer, trying to decipher the code, that win becomes a loss, but sometimes there is no second programmer, and the program will never be read months after it was written, making the extra text simply a waste.
Static Types: Restrictions Can Help?
Restrictions = Certainty
For a function defined in Python as “def foo(a,b)”, the programmer who wrote the code may know that foo is only ever called with two integer values and always returns an int, but that is not the same as the Kotlin “fun foo(a:Int, b:Int):Int”, where any reader at any time, as well as the compiler and the editor, is certain that foo can only ever be called with two integer parameters, and will always return an integer.
Tooling
In the above example, the programmer could know how “foo” is used, but when anyone, even including the IDE, knows the type at any point, then the IDE can offer the options available, whereas when only the programmer knows the type, the programmer has to research the options. This lack of certainty of type is the very limitation that made coding Gradle scripts in Groovy an experience that disappointed the creator of Gradle, and resulted in him moving to Kotlin as the solution.
Performance
Restricting the possible variations can have a huge effect on language performance. Consider the code example in ‘Dynamic Types’ above. To call the ‘test’ method of an instance of the Sample class, consider the steps:
- hash map access for ‘test’ within the object dictionary (not found)
- 'test' in s1.__dict__
- hash map access for ‘test’ within the ‘Sample’ class dictionary
- func_obj = s1.__class__.__dict__['test']
- retrieve the __call__ property from the func_obj dictionary and call it
- func_obj.__dict__['__call__']()
These 3 map lookup steps are required because:
- at any time a new instance attribute ‘test’ could have been added to the object, which will override the class attribute
- at any time the test object within the Sample class could have been replaced by a new object
- at any time the __call__ property of the object stored as ‘test’ could be removed, or set to a different value, which would either mean ‘test’ was no longer callable, or change the code to be run when it is called.
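The lookups above can be reproduced by hand. This is a simplified sketch, not how the interpreter is actually written: real attribute lookup also handles descriptors, base classes and caching, and the final call goes through the type of the function object rather than a dictionary entry. The Sample class is the one from earlier, with test() returning its value so the results can be compared.

```python
class Sample:
    def __init__(self, a):
        self.a = a

    def test(self):
        return self.a


s1 = Sample(1)

# Step 1: look in the instance dictionary (not found for a normal method).
found_on_instance = "test" in s1.__dict__

# Step 2: look in the class dictionary, which yields a plain function object.
func_obj = type(s1).__dict__["test"]

# Step 3: call the function object, passing the instance as 'self'.
result = func_obj(s1)

# The manual steps give the same answer as the normal call.
print(found_on_instance, result == s1.test())  # prints: False True
```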
A statically typed language can avoid all of these steps. The class Sample always has exactly three attributes: init, test, and ‘a’. The compiler can calculate the offset from the base of the object to each attribute, so the names of the attributes are not even needed at run time. Further, if the function ‘test’ is not open, then the compiler can know with certainty what code will be called with no further calculations. An ‘open’ function could be overridden in a subclass, so loading the address of the function becomes a single extra instruction, but even that step can be known to be unnecessary given a stated restriction. Every restriction leaves fewer cases to deal with, and allows more and more compiler optimisations of the code to be run.
The performance difference between these two cases of calling ‘test’ will conservatively result in a run time penalty of more than 10x in just making the call. There are many articles looking at ‘why Python is so slow’, but as discussed later, in most cases Python is just fine for speed.
Why Language Performance Usually Doesn’t Matter
The previous section highlights a huge difference in performance potential between dynamically typed code in Python and statically typed code in Kotlin. Despite that huge difference in potential, the reality is that applications in Python are usually fast enough.
Code Readability
The limitations discussed above under performance are even more significant in terms of code readability than performance. All the limitations stem from the fact that it is impossible, even for the compiler, to determine from the source code what the program should do at run time with the same certainty for dynamically typed programs as for statically typed programs. Computers are fast, and in most cases the result of the compiler’s uncertainty is more computing cycles at run time that may not even be noticeable; however, if a human takes even twice as long to determine what the code should do, then this is very significant.
Simply put, less information as to what the program should do is communicated by dynamically typed code. This can save some thinking things through, and some time, when the program is being written. But where the code is less thought through at the time of writing, that code will require more tests, and usually a longer time, to ensure it does what is really required.
Static Types In Python?
Despite some appearances, no. It is possible to add type annotations to function parameters, and to declare local, class and, through data classes, even instance level variables with type annotations in Python. However these annotations are effectively comments as far as the Python interpreter is concerned: annotating a variable ‘foo’ as ‘int’ using ‘foo:int’ creates the annotation, but the annotation is not enforced by the Python language and foo can be set to any value. A function declared as accepting two int parameters (e.g. def test(a:int, b:int)) can be called with parameters of any type. The type annotations are, like all comments, at risk of not matching what the code does. It is in some ways unfortunate that Python introduced the annotations without enforcing them, as any enforcement now would break compatibility, but enforcement is problematic in such a dynamic language anyway.
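A short sketch of what ‘effectively comments’ means in practice. The function add is a hypothetical example; the annotations are recorded by Python and can be read back, but nothing checks them when the function is called.

```python
def add(a: int, b: int) -> int:
    return a + b


# The annotations are stored on the function object...
recorded = add.__annotations__
print(recorded["a"] is int, recorded["return"] is int)  # prints: True True

# ...but calling with strings is accepted, and + simply concatenates.
joined = add("ab", "cd")
print(joined)  # prints: abcd

# The same applies to annotated variables.
foo: int = 4
foo = "hello"  # no error from Python itself
print(type(foo).__name__)  # prints: str
```

An external checker run over this file would be expected to flag both the call and the reassignment, but that is a separate tool and a separate step.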
The Efficiency Impact
Special Usage: Terminal Impact
When writing an operating system, specifically the device drivers and logic to switch processes, speed is absolutely critical. When writing a compiler, or the internal code of the graphical user interface of an operating system, or many other system tasks, speed is again critical. When writing the internals of mathematical functions, or writing code for embedded systems, speed is also critical.
For all of the above cases, the necessary speed is only delivered by a static language compiled to native code. If a system is written in Python, then a second language is needed for these use cases.
Most Programs: Zero Impact
The special usage cases are just that: special. Unusual, and quite different from the programs most programmers spend their time writing. If a program is slow there are two possible reasons:
- The code implementing the special use cases called by the application is slow
- The application is not using the optimum algorithm
Application code is not really impacted by being written in Python. If code is slow, the most common reason is that the algorithm used by the code is slow. If that is not the problem, then some special usage code is being written in Python, and that code alone should be moved to a statically typed, native compiled language. Simple. Python is quite efficient in terms of speed when used for normal applications.
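As a hedged illustration of the ‘slow algorithm’ case, here are two hypothetical functions that answer the same question, whether a list contains a duplicate. The first compares every pair of items, so its work grows with the square of the list length; the second remembers what it has seen in a set, so its work grows roughly in line with the length. The second version assumes the items are hashable.

```python
def has_duplicates_slow(items):
    """Compare every pair: roughly n*n/2 comparisons for n items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_fast(items):
    """Remember items already seen: roughly n set lookups for n items."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


data = list(range(2000)) + [0]  # one duplicate, at the very end
print(has_duplicates_slow(data), has_duplicates_fast(data))  # prints: True True
```

Both versions are ordinary Python. Moving the slow one to a compiled language would speed it up by a constant factor, while changing the algorithm changes how the cost grows.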
Coding Impact
Decoding Type.
Viewing the restrictions/performance section above, the Python language inspects the dictionary of the ‘s1’ (Sample type) object and, when that fails, the dictionary of the Sample class, to find the ‘test’ attribute. The language needs no type information, and by this technique any object with a ‘test’ attribute, in either the object or the type of the object, will successfully yield a value for ‘test’. With static typing, there is no run time dictionary for each object to inspect; the compiler uses the information it has about the type of the object to know if there is a ‘test’ attribute. Whether there will be a ‘test’ attribute, and how to access that attribute, is decided at compile time. This means the compiler must know the type of the object at compile time.
This only makes a difference when the definitive type can vary, and these are the main cases where code is impacted.
Consider a practical example: retrieving an object from the ‘DOM’ in an HTML document. The method ‘getElementById()’ is a common way to retrieve an element, but the type retrieved will always be ‘HTMLElement’. All actual elements are subclasses of HTMLElement, and have a variety of attributes beyond the base class. Consider a case where the HTMLElement is a text input box, and the goal is to retrieve the value in the text box as entered by the user. An HTMLElement does not have a ‘value’ attribute, so while a dynamically typed language will allow the code getElementById("name").value and interrogate the returned object at run time to learn if there is a ‘value’ attribute, a statically typed language will not allow the code without knowing at compile time that the type has a ‘value’ attribute.
The solution in Kotlin is to cast the returned value to a type known to have a value attribute. The cast can be explicit, or a ‘smart cast’, which requires checking the type first:
// either
(document.getElementById("name") as HTMLInputElement).value

// or
val element = document.getElementById("name")
when (element) {
    is HTMLInputElement -> element.value
    else -> null
}
Duck Typing Anyone?
The difference in the above example is the Python style of “duck typing”. For the Python code it does not matter what type is returned by getElementById(); all that matters is that the returned object has a ‘value’ attribute, which in the metaphor is “quacking like a duck”.
This allows any type with a ‘value’ to run. In this case there is no benefit from allowing any type with a ‘value’; it is just a side effect, as there was no desire to allow another type. This use of ‘duck typing’ becomes another case of ‘no need to explain what is actually happening’, which saves time when it is only the language needing that explanation, but will have a negative impact if others have to decipher the code. The solution can be comments: no comments if no one needs to read the code, and the overhead of explaining only if the code will need to be read later. However, as comments are not as good as self documenting code, there is still a trade off. Dynamic code is best if re-reading will not be needed. Static types are better over the long haul if code will need to be revisited.
Note that interfaces do provide duck typing with static types, when duck typing is a goal and not just a side effect.
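Python’s own type annotations have a rough analogue of this in typing.Protocol, which states ‘anything with a value attribute’ explicitly. The sketch below makes some assumptions: the classes TextInput and Label are invented for the example, and the static part of the check only happens if an external type checker is run. The runtime_checkable decorator additionally allows an isinstance() test, which only checks that the attribute is present.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasValue(Protocol):
    """The 'duck': any object that has a value attribute."""
    value: str


class TextInput:
    def __init__(self, value: str):
        self.value = value


class Label:
    def __init__(self, text: str):
        self.text = text


def read_value(element: HasValue) -> str:
    # The parameter type now says what is needed, not what class it must be.
    return element.value


print(read_value(TextInput("hello")))               # prints: hello
print(isinstance(TextInput("hello"), HasValue))     # prints: True
print(isinstance(Label("not an input"), HasValue))  # prints: False
```

Neither class inherits from HasValue. The requirement is written down once, so a reader can see that duck typing was the intention rather than a side effect.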
Conclusion.
There are special use cases where dynamically typed languages such as Python simply are not suitable (compilers, some embedded systems etc), but for the majority of cases, developers are free to choose between dynamically typed and statically typed languages. Almost always, for those first few lines of code and getting the first embryo of a program running, dynamically typed languages will be briefer and save some time.
For a large project with multiple developers, code will need to be read far more times than it is written, and people have to go back and look at code written months ago. In these cases statically typed code pays back the overhead of a few more keypresses many times over. With Kotlin, those extra keypresses and the thinking things through are minimised, but they are still present. The payback comes over time.
In between the quick one time project and the multi team, long term development, it depends on the programmers and their individual preferences.