Structure: Import, Dependencies & Repositories

Both Kotlin and Python use the keyword “import”. At a superficial level, import performs a similar role in both languages, but the steps each language takes to achieve what is required are very different. Examining how import works explores the structure of Python and Kotlin programs and how they are constructed. Import relies on dependencies, and dependencies rely on repositories, and these are also explained on this page.

  • Revision: Conceptually, What is an Import?
  • Packages, Libraries, Dependencies and Repositories?
  • Python & Kotlin Imports Compared
  • Python: Import, Dependencies & Repositories
    • Background: Modules
    • Import
    • Dependencies & Repositories
    • Environment as Dependency Declaration
  • Kotlin
    • Background
      • Kotlin JVM
      • Class Files and Jar Files
      • Building and Running Kotlin
    • Import
    • Dependencies
    • Repositories

Revision: Conceptually, What is an Import?

All languages come with a standard library: a set of standard functions, classes etc. that provide the basic building blocks as a starting point for building programs. These standard building blocks provide a vocabulary for programs. Import is a mechanism to add more building blocks and extend the vocabulary available to a program.

The standard, automatically imported building blocks provide sufficient vocabulary for simple applications, but more complex applications require more building blocks that are more specific to the application. Effectively, each complex program requires its own jargon. While an application will normally build some of its own ‘building blocks’, effectively defining some jargon within the program itself, a well structured application will also use jargon, or more ‘building blocks’, from outside the application. This allows coding with jargon common to other programs, which enhances readability and keeps the main application as concise as possible. This structure provides for maximum code reuse and clean code, and is all enabled by ‘import’.

As building blocks become more numerous and more specialised, importing them all automatically, as with the standard library, would simply be too much; hence the explicit import.

An ‘import‘ statement adds names for the building blocks to the namespace, creating the vocabulary of the program.
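As a small illustration in Python, the sketch below checks the names in the current scope before and after an import. The module `colorsys` is chosen only because it is a small standard module that is unlikely to be loaded already; any module would do.

```python
# Before the import, the name "colorsys" is not in the current scope.
before = "colorsys" in dir()

import colorsys  # adds the single name "colorsys" to the current scope

# After the import, the new 'word' is part of the vocabulary.
after = "colorsys" in dir()

print(before, after)
```

The import changed nothing except the set of names the program can use, which is the sense in which import extends the vocabulary.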

While import introduces vocabulary to the program, it is also necessary for the computer to process the definitions of that vocabulary for the complete program to function. A link between the imported vocabulary and the exact definitions of that vocabulary is required. Is the program working with version 1.0 or version 1.1 of this vocabulary, or is there a newer version available? Has another group of programmers created an alternative version, or effectively an alternative definition, of the same vocabulary? There has to be a system for accessing the selected building blocks, and ensuring the correct version of the building blocks is ready for import into the current program.

Packages, Libraries, Dependencies and Repositories?

Packages

Consider the analogy of configuring a new car you wish to buy. Certain features come standard, and others are options. Adding an option is analogous to an import, which brings in this new feature. With cars, sometimes certain options are available as packages: a group of optional features which are all packaged together. Add the package, and you have all the included features. In software, a ‘package’ is a set of program building blocks that can be imported either as a complete set, or individually. A software package may contain only a single ‘accessory’ or program building block, but it is more common that the package contains multiple building blocks, and the program can select all or individual items from the package.

Libraries.

The first thing to note is that ‘library’ can be used interchangeably with ‘package’ to mean exactly the same thing. However, library can also be used to mean a set of packages, so some libraries contain multiple packages. With the car analogy, this is like a package that contains a number of sub-packages. Think of a library as either a package or a set of packages.

Dependencies.

Dependencies are packages or libraries that a program needs. A package or library is a dependency of a particular program when that program needs something from the package or library. The analogy would be that, when buying a car, you may need the package that includes the sunroof if you just have to have the sunroof, but analogies are never perfect. A package or library is like an option or set of options available for a car, while a dependency is an option or set of options you have chosen and are purchasing.

Repositories

A repository is a source or supply of packages/libraries. Most people probably select most of their car ‘options’ from the car dealership when they buy their car, and most languages have an equivalent: a language endorsed repository or supply of packages/libraries. For Python the default repository is PyPI, and for Kotlin it is Maven Central, and these locations are the source of most packages/libraries chosen as dependencies by programs in each respective language. Both Python and Kotlin also allow for other sources of packages and libraries, so there are other repositories. In the car analogy, another repository would be like an independent car accessory store.

Note that a repository can contain multiple versions or alternatives for the same dependency. This is analogous to different wheels for a car: only one set can be chosen, but there are different alternatives. There may be a version 1.0 and a version 1.1 of the same package, so there is a choice of which one becomes a dependency for the current project.

Relationship: Imports, Dependencies and Repositories.

A programmer looking to add a specific feature or building block, first must locate a repository that can ‘supply’ that feature. Then the package or library must be extracted from the repository and made available to the program as a dependency. With the dependency obtained, the program can then import either the entire dependency or items from that dependency.

Python & Kotlin Imports Compared

Python is an interpreted language. With an interpreter, all code performs an action, and with Python that includes imported code. Every line of imported code is run by Python, with any building blocks installed into the run time environment. Further, as with any interpreter, running a program is a single step process, and therefore with Python an import is also a single step.

Contrast this with the compiled language Kotlin. With a compiler, all definitions are processed prior to run time, and definitions first create information internal to the compiler. The Kotlin compiler does not ‘run’ any code of an application, or from imports. For Kotlin, ‘import’ means use the definitions of building blocks when compiling. The program is compiled with the knowledge of the external building blocks, but as with any compiler, this does not itself add the building blocks to the final program. Before a Kotlin program can run, there is an additional step required to connect the code references to the building blocks with the code implementations of those building blocks.

Python: import is a single step that ensures the implementation code of the dependency has been loaded and run (triggering load and run if not), and imports the specified names from dependencies into the containing namespace.

Kotlin: import is a first step that imports the relevant names from dependencies into the containing namespace, and it requires a second step, loading the implementation code of the dependency, that takes place at the ‘link’ stage.

Python: Import, Dependencies and Repositories.

modules

Understanding Python modules is required to really understand Python ‘import’. The runtime of Python is divided into modules, with each dependency consisting of one or more modules. The main module of any Python program, as with the classic ‘print("hello world")’, is the ‘__main__’ module. In a classic catch-22, you need to understand modules to understand imports, but introducing modules requires using an import:

>>> import sys
>>> type(sys.modules)  # What is sys.modules?
<class 'dict'>
>>> len(sys.modules)  # how many modules are present when idle starts?
152
>>> "__main__" in sys.modules  # check __main__ is in modules?
True
>>> "typing" in sys.modules  # is there a typing module when idle starts?
False
>>> import typing
>>> "typing" in sys.modules  # re check for typing after import
True

Clearly, the Python environment is pre-loaded with many ‘modules’ (152 in the example above), including ‘__main__’. Additional ‘modules’ can be added to the dictionary of modules using import, as in the above example with “typing”. So what is a ‘module’, and what does a module do?

>>> type(sys.modules["__main__"])
<class 'module'>
>>> help(type(sys.modules["__main__"]))
#  output is too long to be here....but "__dict__" is the main data in help
>>> "foo" in sys.modules["__main__"].__dict__
False
>>> foo = "hi"
>>> "foo" in sys.modules["__main__"].__dict__
True
>>> sys.modules["__main__"].__dict__ is globals()
True

A module is an object of module type, which is described in the Python documentation and has ‘help()’ that can be read (as shown in the sample code above). The significant detail is that a module holds a __dict__ of values, which are all the globals created by the related Python file.
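To make the ‘module is an object holding a __dict__’ idea concrete, here is a minimal sketch that builds a module by hand rather than through import. The module name `demo` and the source text are made up for illustration, and running source with `exec` is only roughly what loading a .py file does.

```python
import types

# Build an empty module object by hand (the name "demo" is hypothetical).
demo = types.ModuleType("demo")

# Run some source code using the module's __dict__ as its globals.
source = "greeting = 'hi'\ndef shout():\n    return greeting.upper()\n"
exec(source, demo.__dict__)

print(demo.greeting)             # attribute access reads from demo.__dict__
print(demo.shout())              # the function sees the module's globals
print("greeting" in vars(demo))  # vars(demo) is the same dict as demo.__dict__
```

Everything defined by the source ended up as an entry in the module's dictionary, which is all that ‘the globals of a module’ really means.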

import

As seen in ‘modules’ above, an import will load a Python file into a module in ‘sys.modules’ and then add the name of the imported module to the current globals(). Here is an example of an import. First, a file ready to be imported:

# this is the file 'test_import.py'

foo = "Hello"
def bar():
   print("bar for test_import")

print(f'the import { __name__}, {"bar" in globals()}, {"mainbar" in globals()}')

Now a test.py to experiment with the import:

# this is test.py

def mainbar():
   print("mainbar for test")
print(f'main step 1 { __name__}, {"bar" in globals()}, {"mainbar" in globals()}')

import test_import
print(f'main step 2 { __name__}, {"bar" in globals()}, {"mainbar" in globals()}, {test_import}')

from test_import import bar
print(f'main step 3 { __name__}, {"bar" in globals()}, {"mainbar" in globals()}')

import sys
print(f'check for foo {"foo" in sys.modules["test_import"].__dict__}')

Running ‘test.py‘ should be expected to do the following:

  • define ‘mainbar’ in the __main__ module (test.py)
  • print main step 1 (False, True), showing ‘bar’ is not in globals, but ‘mainbar’ is
  • import ‘test_import’, which will run every line of test_import
    • define ‘foo’ in the globals of ‘test_import’
    • define ‘bar’ in the globals of test_import
    • print ‘the import’ (True, False), showing ‘bar’ but not ‘mainbar’ in globals, so the globals() of test_import is not the same as the globals of test.py
  • print main step 2 (False, True, <module ‘test_import’ ...>)
    • showing ‘bar’ is still not defined within ‘test.py’
    • showing test_import has now been defined, through the import
  • perform ‘from test_import import bar’
    • this import will not load test_import as it has already been loaded
      • this means none of the statements in test_import are run again
      • no second print of ‘the import’ showing statements not run again
  • print main step 3(True, True)
    • showing the import from brought ‘bar’ in to globals() for main
    • and again showing mainbar is still in globals
  • print check for foo (True)
    • showing that globals of other modules can be accessed remotely

Summary:

  • import first checks if the module is already loaded
    • Python code consists of modules, with each module containing its own globals()
    • if not loaded, then the module is loaded, which executes all the code of that module
    • names as specified by import or from <file> import are added to the globals() of the module doing the import
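The ‘load once’ behaviour summarised above can be checked with a self-contained sketch. The module name `load_once_demo` and the log file are made up for illustration: the module appends a line to a log each time its code is executed, so the number of lines shows how many times it was actually loaded.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module into a temporary folder (the name is hypothetical).
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "load_once_demo.py"), "w") as f:
    f.write(
        "with open(__file__ + '.log', 'a') as log:\n"
        "    log.write('loaded\\n')\n"
        "value = 42\n"
    )

sys.path.insert(0, folder)      # make the folder visible to import
importlib.invalidate_caches()   # the file was created after Python started

import load_once_demo              # first import: the file is executed
from load_once_demo import value   # second import: found in sys.modules, not re-run

with open(load_once_demo.__file__ + ".log") as log:
    load_count = len(log.readlines())

print(load_count, value)
```

Two import statements, but the module body ran only once; the second statement just copied a name out of the already loaded module.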

Import vs from Import

The main role of imports is to increase the vocabulary available to the program. The plain Python ‘import’ adds only a single new ‘word’ to the vocabulary, although all of the names within the module are available as attributes of that one new word, while ‘from ... import’ brings new words from the module directly into the global namespace of the current module.

>>> import math
>>> print(math.log(math.e))
1.0
>>> from math import log,e
>>> print(log(e))
1.0

‘import’ and ‘from import’ offer choices of vocabulary. Which is the best choice will often be decided by how frequently the new vocabulary will be used.
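Whichever form is chosen, the objects reached are the same; only the words used to reach them differ. A short sketch, where the alias `m` is just an arbitrary choice for the example:

```python
import math
from math import log, e

# Both spellings reach the very same function object.
same_function = log is math.log

# "import ... as" picks a different word for the same module object.
import math as m
same_module = m is math

print(same_function, same_module)
```

No second copy of the module is created by any of these statements; each one only adds names to the importing module.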

Python Dependencies

For Python, the dependencies are all installed as packages into the current environment. If a virtual environment is created specifically for each program, then the dependencies installed into the environment will be specifically for each program, but a simpler approach with Python is for all programs to share the same environment and therefore all programs share the same dependencies.
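To see which environment a program is actually running in, and therefore where its dependencies come from, the interpreter can be asked directly. This is a minimal sketch relying on the standard `sys.prefix` and `sys.base_prefix` attributes:

```python
import sys

# In a virtual environment sys.prefix points at the environment folder,
# while sys.base_prefix still points at the base Python installation.
in_virtual_env = sys.prefix != sys.base_prefix

print("environment:", sys.prefix)
print("virtual environment?", in_virtual_env)
```

Running this inside and outside a virtual environment shows the two cases described above: a per-program environment, or the shared one.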

The main Python tool to install dependencies into the current environment is pip, and the main repository for Python is PyPI (the Python Package Index). However, as stated in the description of pip, it is possible to use pip with other package indexes, and also with individual folders or git repositories containing program resources.
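Whether a dependency is present in the current environment can be checked without importing it (and so without running its code). The helper name `is_installed` and the second, deliberately unlikely package name are made up for this sketch:

```python
import importlib.util

def is_installed(name):
    """Return True if a top-level module called 'name' can be found for import."""
    return importlib.util.find_spec(name) is not None

print(is_installed("json"))                      # part of the standard library
print(is_installed("surely_not_installed_xyz"))  # hypothetical, not expected to exist
```

If a check like this fails for a real dependency, the usual remedy is to install it into the environment with pip.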

Python Repositories

PyPI is the standard Python repository, and ‘pip install <package>’ will by default install ‘package’ as a dependency from the PyPI repository. Alternatives to PyPI are supported, and in addition ‘pip’ with the -e option can add a dependency from a folder that has been cloned by git or developed on the host system. Such a folder acts as a simple ‘individual library’ repository, allowing very flexible supply of dependencies.

KotlinJVM: Imports, Dependencies, Repositories

KotlinJVM?

Kotlin was first implemented targeting the Java Virtual Machine or JVM. Kotlin now also targets the JavaScript environment and the LLVM compiler, and these targets will be discussed separately in later pages. Much of the way Kotlin programs are developed for all platforms derives from the JVM environment, so examining how Kotlin works on the JVM is the best first step in all cases, but a little further reading may still be required for other platforms.

The JVM is, like Python, an interpreter, but as it is an interpreter only for code that has already been compiled, the differences are very significant. A language is almost entirely defined by the first stage of its pipeline, and this is a compiler for Kotlin and an interpreter for Python. See Languages: compilers & interpreters.

Jar Files and Class Files.

With Python, understanding modules is a key to understanding the workings of Python, and understanding class files is a key to understanding Kotlin. Python modules are objects created inside the Python runtime, with each new import creating a module inside that runtime. The Kotlin compiler produces class files, and does not itself run anything or have a run time environment. These class files have two key roles with import. The first role is with compilation; the second role is with running the class files produced by compilation.

Class files, the output of compilation, contain both code that is compiler output, as would be the case with ‘C’ *.obj files, and definition data, as would be the case with ‘C’ header (*.h) files. They are a single location resource for two very different types of information. With the sample hello project, you can open the HelloKt.class file in Intellij and see both the declarations present, and that there is code present, although what the code does cannot be seen.

Jar files are simply folders of class files and other files that have been compressed as a zip file. Any zip tool such as ‘7zip’ (Windows) or Keka (MacOS) can be used to view the contents of a jar file as a zip. A jar file is the equivalent of a folder within a Python project. When working with Kotlin, you can generally consider a jar file as equivalent to a folder of class files.
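Because a jar is a zip, ordinary zip tooling can create and read one. The sketch below uses Python's standard `zipfile` module on an in-memory archive; the entry names mimic a Kotlin build, but the bytes are placeholders rather than real compiled classes.

```python
import io
import zipfile

# Build a tiny jar-like archive in memory (contents are placeholders).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    jar.writestr("HelloKt.class", b"placeholder bytes")

# Reading it back uses exactly the same zip API.
with zipfile.ZipFile(buffer) as jar:
    entries = jar.namelist()

print(entries)
```

Pointing `zipfile.ZipFile` at a real jar file on disk should list its class files in the same way.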

Building & Manually Running Kotlin in Intellij

Intellij provides a ‘green arrow’ in the margin of the editor, to the left of any code which can be run independently. This ‘green arrow’ makes running code very easy, but hides the steps necessary to run that code, and this page is about examining those steps hidden by the ‘green arrow’. Specifically, the green arrow completes:

  1. a ‘compile’ step (build)
  2. then completes a second and less transparent ‘link‘ step in order to run what has been built

The compile/build step can be directly triggered by using the gradle menu ‘tasks/build/build’ entry, but it is the automatic run step that hides what is happening. Understanding dependencies requires a more transparent view of what the link step needs in order to run programs, and using the ‘java’ or ‘kotlin’ command at the command line reveals just what is taking place. Generally it doesn't matter what smart things Intellij is doing to help as long as we get the outcome we want, so using the green arrow is desirable; it is only when trying to see what is happening that the green arrow is less useful. To manually run a Kotlin program, use the ‘View/Tool Windows/Terminal’ menu option of Intellij, and run using the ‘java’ or ‘kotlin’ commands from the command line.

Kotlin Imports

Kotlin Imports Step 1 of 2 Steps: Compile (API) time dependency

At compile time, Kotlin imports only add the definitions (the API) of what is being imported, as information for the current compilation. At compile time, the code behind the imports (the implementation) is not significant, as the compiler only updates the vocabulary or names available to the program, without including the code (or implementation) for that vocabulary. Unlike Python, where the code of the import is run at the time of import, a Kotlin import does not even require code to be present; it just requires definitions. This is similar to how header (.h) files allow a C program to compile, while the matching program code (a *.obj or *.lib file) must still be linked to the program before the program can make use of the code. A Kotlin import allows the code to compile, but does not include an implementation, and a separate step is required for running the compiled program.

Kotlin class files can contain both the API data and the implementation code. Step 1 does not need the compiled implementation code, and it would be possible to complete step 1 using class files with different compiled code from that which the program will use at run time, provided the definitions in the class files are correct. Opening a class file in Intellij shows the definitions, but just shows /* compiled code */ in place of actually revealing what is in the compiled code.

Kotlin Imports Part 2: Link Time Dependency.

The second stage of the import, ‘linking’ the code behind the imported vocabulary to the program, takes place either during an ‘assembly’ stage of a gradle build or at the time of running the program.

Automatic, Hidden Imports: Stage 1- Compile time.

Consider the classic Kotlin ‘hello world’, as with the ‘Import/File’ branch of the KotlinSandbox. With the Hello project (KotlinJVM/Hello/src/kotlin/Hello.kt), there is an automatic, hidden import of ‘println’.

No import statement is required for the ‘println’ function because, as ‘println’ is part of the standard library, the import is automatic. The compiler will complete the first step of the import automatically, but running the program will only succeed because the build.gradle.kts file contains the line ‘implementation(kotlin("stdlib"))’. Comment out this line, and try to run the Hello program (the green arrow should do, but change the ‘hello world’ message so the program must build again before running). An error that ‘println’ is an unresolved reference will result. Uncomment the ‘implementation(kotlin("stdlib"))’ line and the program will again run. Println is an automatic, and hidden, import. The list of automatic imports is here, but all default or automatic imports still require a dependency for ‘link time’ to be successful.

Automatic Imports: Stage 2 – link time.

The Kotlin compiler produces a HelloKt.class file (Sandbox\KotlinJVM\Hello\build\classes\kotlin\main\HelloKt.class). Trying to run this from a shell using ‘java HelloKt’ (make sure JAVA_HOME is set as an environment variable and %JAVA_HOME%\bin is on the path) will give ‘java.lang.NoClassDefFoundError’ because the standard Kotlin imports used in the compile (see stage 1 above) have not been automatically applied to the run time, which is the second use of the import. Running using java does not automatically provide the Kotlin standard library, which is why the error occurs. To manually include the Kotlin standard library:

java -cp ".";".\kotlin-stdlib-1.3.11.jar" HelloKt

This command instructs as follows:

From the classes present in the current directory (".") and in "kotlin-stdlib-1.3.11.jar", which can also be treated as a directory, run the main method of the HelloKt class.

Running ‘Hello’ this way requires having HelloKt.class in the current directory, and copying kotlin-stdlib-1.3.11.jar to the same directory (or alternatively using ".";"<path to stdlib>\kotlin-stdlib-1.3.11.jar"). This will also need appropriate changes if working with a different version of the stdlib, and if working with MacOS or Linux, substituting “/” for “\” and using a “:” to separate the folders of the class path, not a “;”.

Running with the ‘-cp ".."’ option will find no classes, so there will be an error that the HelloKt class is not found. Running with ‘-cp "."’ will find the HelloKt class, but none of the stdlib will be available, so this will also result in an error.
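The platform difference in the class path separator is easy to get wrong, so here is a small sketch that assembles the command as a list of arguments. The helper name `java_command` is made up; Python's `os.pathsep` happens to follow the same convention as the Java class path separator (";" on Windows, ":" on MacOS and Linux).

```python
import os

def java_command(main_class, classpath_entries, separator=os.pathsep):
    """Return the argument list for running main_class with the given class path."""
    return ["java", "-cp", separator.join(classpath_entries), main_class]

entries = [".", "kotlin-stdlib-1.3.11.jar"]

print(java_command("HelloKt", entries, separator=";"))  # Windows style
print(java_command("HelloKt", entries, separator=":"))  # MacOS / Linux style
print(java_command("HelloKt", entries))                 # whatever this machine uses
```

The resulting list could be passed to `subprocess.run` to launch the program, assuming ‘java’ is on the path and the listed files exist.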

In fact the Kotlin stdlib in turn has the Java standard libraries as dependencies. So there would still be an error with the above example, except that the ‘java’ command will automatically locate the Java standard library classpaths. There is also a ‘kotlin’ command which will run the java command and automatically add all the Kotlin standard libraries, but the point here is to see what is required.

Packages

See packages and Kotlin Package Namespaces.

Kotlin Import Syntax

The import statement in kotlin takes the form:

import <full package name>.<target>

Where the target can be any attribute that does not require an instance of a class, and is declared in the dependency without a ‘private’, ‘protected’ or ‘internal’ visibility modifier. This includes:

  • a class (Java or Kotlin style)
  • a static attribute of a Java class
  • any kotlin definition (e.g fun or var) defined without an enclosing class
  • singleton objects and/or companion objects
  • attributes of singleton/companion objects

Unlike Python, the file name is not part of the import. Locating the source of the import requires matching the package name to dependencies.
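For contrast, the Python side of this can be inspected directly: the import name maps to a file or folder of the same name, and the import system will report which one. A minimal sketch using the standard `json` package as the example:

```python
import importlib.util

# Ask the import system where "import json" would load its code from.
spec = importlib.util.find_spec("json")

print(spec.name)     # the import name
print(spec.origin)   # path of the file that would be executed
```

With Kotlin there is no such lookup by file name; the package name in the import has to be matched against the class files supplied by the dependencies.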

Kotlin Dependencies vs Python Dependencies.

With Python, standard dependencies are included with the installation, and others can be added using ‘pip install’.

With Kotlin, even for the automatic import of ‘stdlib’, information on ‘dependencies’ is still required for running the program. The standard sandbox hello world can be used to show this. Change ‘fun main’ by commenting out the ‘println’:

fun main(){
    3+2 //println("hello world")
}

The program will now run with the dependency block of the gradle file commented out. Add back the ‘println’ and running the program fails. The reason is that the dependency information is supplied to the classpath for the run command. The same information is also supplied to the compile command, even though in the special case of the stdlib the compiler already assumes the vocabulary is present, and with stdlib it is only the implementation that changes. Adding the dependency ensures both stages, compile time and link time, are provided with the information from the class files as needed at each stage. So a dependency still needs to be declared even when the import is automatic, because there are two roles of dependencies with compiled languages:

  1. dependencies at compile time
  2. dependencies at link time

Specifying Kotlin Dependencies With Gradle.

Kotlin uses class files to describe the information needed at compile time and at link time. So simply declaring these dependencies once can ensure the class files, which provide both types of data, are available to both steps.

In Python, the dependencies are added to the Python environment using ‘pip install’. With Kotlin, the dependencies are added to the compile command and the ‘link’ command. Dependencies are best specified in gradle, which then allows any ‘plugin’ to use the dependency information over and over.

Gradle ‘dependencies’ blocks allow for specifying dependency information, so that any ‘plugin’ such as one supporting Kotlin compile, has the information it needs.

dependencies {
    implementation(kotlin("stdlib")) // 'implementation' replaces the older 'compile'
}

API vs Implementation Dependencies.

Gradle has introduced the concept of dividing dependencies, previously all described as ‘compile’ dependencies, into ‘api’ dependencies (direct replacements for ‘compile’ dependencies) and the new category of ‘implementation’ dependencies. While old compile dependencies could all be changed to ‘api’ dependencies without problems, such a replacement misses the point of the change: most dependencies are in fact implementation dependencies, it is just that there was previously no way of declaring an implementation dependency.

So what is the difference? Firstly, there is no difference unless the code using the dependency will in turn be a dependency for other code. So for the case of A depends on B which in turn depends on C:

  • C has no dependencies and provides an API that can be used as a dependency
  • B has C as a dependency, and provides an API that can be used as a dependency
  • A has B as a dependency, but is not intended to be used as a dependency so has no API

Only for B does API vs implementation make a difference, because the choice B makes for C determines whether the definitions from C are also visible to A when A is compiled.
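The A, B, C case can be modelled in a few lines. This is a simplified model of the rule rather than anything Gradle itself runs, and the function names and the dictionary layout are made up for the sketch: a project compiles against its direct dependencies plus whatever those dependencies pass on through ‘api’ declarations.

```python
def exported(project, deps):
    """Names a project passes on to its consumers: its 'api' dependencies, transitively."""
    names = []
    for name, kind in deps.get(project, []):
        if kind == "api":
            names.append(name)
            names.extend(exported(name, deps))
    return names

def compile_classpath(project, deps):
    """Direct dependencies plus whatever those dependencies export."""
    names = []
    for name, _kind in deps.get(project, []):
        names.append(name)
        names.extend(exported(name, deps))
    return names

# B declares C as 'api': A can compile against both B and C.
with_api = compile_classpath("A", {"A": [("B", "implementation")], "B": [("C", "api")]})

# B declares C as 'implementation': A compiles against B only.
with_impl = compile_classpath("A", {"A": [("B", "implementation")], "B": [("C", "implementation")]})

print(with_api, with_impl)
```

In both cases C is still needed when the program runs; the declaration only changes what A is allowed to see at compile time.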

Kotlin JVM: Jars and Fatjars.

With the JVM, the ‘link’ step is part of the instruction to run a specified ‘java’ class from the available classes. The classpath provides java with the collection of classes compiled for the application, but as seen above, the complete list of jars for all dependencies will also need to be included on the classpath. This makes for a rather complex run command, and requires shipping several different jars together with the class files for the main program.

However, all the class files for the application can be ‘zipped’ into a single jar file and, just like any other zip, jar files can have sub folders. So why not zip the dependencies of the application together with the application into one all encompassing jar file? A jar file with everything is known as a ‘fat jar’ and is an easy way to ship a java application that will be run by a JVM.

Generally, if a project will be used as a set of resources for another project, then that other project will use a jar of just the files from this project, and add other dependencies itself, but if this project is an application to be distributed as a single jar file, then a ‘fat jar’ is a convenient solution.

So now we have yet another usage for the dependency information collected by gradle: this same data can be used to build a ‘fat jar’, to build a class path to run the final program, and to provide the compiler with class descriptions used at compile time.
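The idea of a fat jar can be sketched with plain zip tooling: copy the entries of the application jar and of each dependency jar into one archive. The helper names, the entry names and the ‘first copy of a name wins’ rule are assumptions made for this illustration, and the bytes are placeholders; a real build would normally let a gradle task do this and would also need a suitable manifest.

```python
import io
import zipfile

def make_jar(entries):
    """Create an in-memory jar-like zip from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for name, data in entries.items():
            jar.writestr(name, data)
    return buffer

def merge_jars(jars):
    """Copy every entry of every jar into one archive; the first copy of a name wins."""
    fat = io.BytesIO()
    seen = set()
    with zipfile.ZipFile(fat, "w") as out:
        for jar in jars:
            with zipfile.ZipFile(jar) as source:
                for name in source.namelist():
                    if name not in seen:
                        seen.add(name)
                        out.writestr(name, source.read(name))
    return fat

app = make_jar({"HelloKt.class": b"app placeholder"})
stdlib = make_jar({"kotlin/io/ConsoleKt.class": b"stdlib placeholder"})

with zipfile.ZipFile(merge_jars([app, stdlib])) as fat_jar:
    fat_entries = fat_jar.namelist()

print(fat_entries)
```

The merged archive holds the application classes and the dependency classes side by side, so a single entry on the class path is enough to find both.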

Kotlin Repositories

Repositories are storage of files for dependencies. The main repository of Python is PyPI, and the url of ‘pypi.org’ is so well known to pip that there is no need to tell pip where to look if the source of the dependencies is PyPI. However, you can tell pip to look for a ‘custom PyPI’ in another location on the internet, and you can host your own repositories of Python dependencies. With Python, both these options are rare, while with Kotlin, use of repositories other than Maven Central (the equivalent of PyPI) is far more common.

Why more common with Kotlin? One reason for alternative repositories is keeping commercial project code private. Python is optimal for applications where the application will not be distributed commercially, so there are fewer reasons to keep code private. Some large commercial Python code bases go as far as using gradle as a solution for Python repositories, so the need for alternate repositories may be less common, but it certainly does exist.

Repositories are storage for multiple dependencies, and typically even multiple versions of each of those dependencies. Applications need few repositories, and generally all possible repositories can be specified at project root level, and then all is complete.
