Python Imports 101
Python is well known as one of the more beginner-friendly programming languages.
However, certain aspects can seem a bit mysterious and have a tendency to frustrate those lacking experience:
- Different libraries imported (standard library, third-party, user created; preinstalled vs user installed)
- Different import methods (file vs. package imports; absolute package imports vs. relative package imports)
- Different interpreters (preinstalled vs user installed; different versions, e.g. Python 2.7 vs Python 3.6)
- Different ways of installing third-party libraries (various package managers and cloud repositories)
- Virtual environments
- Creating your own libraries and making them accessible to others
Even experienced developers coming to Python from other languages can find themselves in sticky situations (e.g. “I think I did something to my system Python”) or be left scratching their heads.
In this post we are going to take a look at the first two points, different libraries and import methods.
In doing so, we will provide an introduction to imports in Python, taking a “first principles” approach.
The examples in the post were run on Ubuntu 16.04. Whilst details will vary, the general concepts should be platform agnostic (although Windows users may have a harder time following).
Python interpreters and libraries
Before getting started, a few words on interpreters as without an interpreter we won’t be importing anything!
On Linux / macOS, usually at least one Python interpreter comes preinstalled.
In practice, a preinstalled (i.e. system) Python interpreter should not be used for development work.
However, in this post we will do so to keep things as simple as possible and concentrate on imports.
On a clean Ubuntu OS, there is a Python 2 interpreter, e.g. /usr/bin/python2.7, and a Python 3 interpreter, e.g. /usr/bin/python3.5, preinstalled (from here on, examples refer to Python 3), as well as Python libraries:
- Standard library in /usr/lib/python3.5
- Third-party libraries in /usr/lib/python3/dist-packages
Your mileage may vary, though: e.g. you might have only files and directories from the standard library preinstalled, or only one interpreter.
In my case, I had directories like requests and bs4 in /usr/lib/python3/dist-packages.
In your standard library directory, you should see files like random.py that are part of the standard library.
At a minimum, an interpreter and the standard library should be preinstalled.
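If you would like to check where these live on your own machine, the short sketch below asks the running interpreter for its own location and for the directories it was configured with. It only uses the standard library (sys and sysconfig); the exact directories printed will differ between systems, and on Debian/Ubuntu the third-party directory reported here may not be the only one in use.

```python
import os
import sys
import sysconfig

# Path of the interpreter that is running this code.
interpreter = sys.executable

# Directories this interpreter was configured with.
paths = sysconfig.get_paths()
stdlib_dir = paths["stdlib"]        # standard library
third_party_dir = paths["purelib"]  # default install location for pure-Python third-party code

print("interpreter:     ", interpreter)
print("standard library:", stdlib_dir)
print("third-party:     ", third_party_dir)

# random.py is normally a plain file in the standard library directory.
print("random.py found: ", os.path.isfile(os.path.join(stdlib_dir, "random.py")))
```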
Out-of-the-box imports
Assuming you have a requests directory in /usr/lib/python3/dist-packages, writing the following in a Python script or Python shell
import random
import requests
requests.get('http://bbc.co.uk')
random.randint(0,10)
just works - you have successfully imported and called functions in two libraries.
A couple of points:
- The import might refer to a file (module), e.g. random.py, or a directory (package), e.g. requests.
- import statements make no reference to any directory paths; how does Python know where to find the relevant code?
The answer is the Module Search Path (MSP).
The MSP is a list of directory paths, available in Python as sys.path, that is set each time a Python script or shell is started.
If you write import requests, Python will take the first directory path in the MSP and see if there is a file called requests.py or a directory called requests in that directory.
If so, it will do the import. If not, it will move on to the second path in the list, etc.
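The lookup just described can be sketched in a few lines. This is a deliberately simplified model, not what the interpreter actually runs: the real import machinery also handles built-in modules, compiled extensions, zip files and caching. The function name find_in_search_path and the module name mymod are made up for the example.

```python
import os
import tempfile


def find_in_search_path(name, search_path):
    """Return the first matching package directory or module file, else None."""
    for directory in search_path:
        package_dir = os.path.join(directory, name)
        module_file = os.path.join(directory, name + ".py")
        if os.path.isdir(package_dir):
            return package_dir
        if os.path.isfile(module_file):
            return module_file
        # Nothing here: fall through to the next directory in the list.
    return None


# Two throwaway directories stand in for entries of the MSP.
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    open(os.path.join(second, "mymod.py"), "w").close()
    found = find_in_search_path("mymod", [first, second])
    missing = find_in_search_path("nothere", [first, second])

print(found)    # path ending in mymod.py, located in the second directory
print(missing)  # None
```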
What paths are in the MSP?
- “Home” - if you are running a script, this is the directory of the script; if you are in a Python shell, it is the current working directory
- PYTHONPATH environment variable
- Directory containing the standard library
- Directory containing user installed third-party libraries
- Directory containing preinstalled third-party libraries
To see for yourself,
$ /usr/bin/python3.5 -m site # working directory /home/jim
sys.path = [
'/home/jim', # "home"
'/usr/lib/python3.5', # standard libraries
'/usr/local/lib/python3.5/dist-packages', # user installed third party libraries
'/usr/lib/python3/dist-packages', # preinstalled third-party libraries
]
(there may be others too, but the above are usually of most interest).
The MSP is important. If it is empty, you won’t be able to import anything that is not built into the interpreter or already loaded…
import sys
sys.path = []
import random # ImportError: No module named 'random'
import requests # ImportError: No module named 'requests'
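To try this without breaking your session, the sketch below empties sys.path only temporarily and restores it afterwards. The helper name import_with_empty_path is made up for this example, and colorsys is used simply because it is a small pure-Python standard library module that is unlikely to be loaded already.

```python
import importlib
import sys


def import_with_empty_path(name):
    """Try to import `name` while sys.path is empty; return True on success."""
    saved_path = sys.path[:]
    saved_module = sys.modules.pop(name, None)  # force a fresh lookup
    sys.path[:] = []
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
    finally:
        # Always put things back the way they were.
        sys.path[:] = saved_path
        if saved_module is not None:
            sys.modules[name] = saved_module


worked = import_with_empty_path("colorsys")
print(worked)                # False: no directory left to search
print(len(sys.path) > 0)     # True: the original MSP has been restored
```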
Importing your own code
Let’s first take a look at file imports.
File imports
Suppose you have the following directory structure
import_examples/
└── dir0
    ├── a.py
    ├── b.py
    └── dir1
        └── c.py
with Python files
# a.py
import b
print(b.x)
# b.py
x = 'hello'
# c.py
y = 'bye'
Let’s run a.py
$ cd /path/to/import_examples
$ /usr/bin/python3.5 dir0/a.py # prints "hello"
In a.py, import b is successful as the first path in the MSP is “home”, the directory containing a.py, i.e. /path/to/import_examples/dir0.
Thus when the interpreter runs import b, the first file it looks for is /path/to/import_examples/dir0/b.py, which exists, so its contents are imported! The interpreter then moves on to the next line in a.py.
So far, so good.
Now, say we modify a.py
# a.py
import b
import c # extra import
print(b.x)
print(c.y) # extra print
Then
$ /usr/bin/python3.5 dir0/a.py # prints error below
Traceback (most recent call last):
  File "dir0/a.py", line 3, in <module>
    import c
ImportError: No module named 'c'
because the interpreter first looks for a file /path/to/import_examples/dir0/c.py, which does not exist, so it moves on to the next path in the MSP and looks for a file /usr/lib/python3.5/c.py, which also does not exist, then /usr/local/lib/python3.5/dist-packages/c.py and /usr/lib/python3/dist-packages/c.py, neither of which exists.
At this point, the interpreter has gone through all the paths in the MSP without success, so it raises an ImportError.
To get round this, we could add the path of the directory containing c.py, i.e. /path/to/import_examples/dir0/dir1, to the MSP
# a.py
import sys
sys.path.append('/path/to/import_examples/dir0/dir1') # add path to MSP
import b
import c
print(b.x)
print(c.y)
or to PYTHONPATH.
However, both methods quickly become tedious. A better way is to use package imports.
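For completeness, here is a self-contained sketch of the sys.path approach that you can run anywhere. It recreates the small directory layout in a temporary directory and runs a.py with the current interpreter. One assumption of mine that goes beyond the text: a.py builds the dir1 path from its own location (via __file__) instead of hard-coding /path/to/import_examples, which makes the workaround a little less brittle but no less tedious.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# a.py: same idea as above, but the extra MSP entry is derived from a.py's location.
A_PY = textwrap.dedent("""\
    import sys
    from pathlib import Path
    sys.path.append(str(Path(__file__).resolve().parent / "dir1"))  # add dir1 to MSP
    import b
    import c
    print(b.x)
    print(c.y)
""")


def run_file_import_demo():
    """Create dir0/a.py, dir0/b.py, dir0/dir1/c.py and run a.py as a script."""
    with tempfile.TemporaryDirectory() as tmp:
        dir0 = Path(tmp) / "dir0"
        (dir0 / "dir1").mkdir(parents=True)
        (dir0 / "a.py").write_text(A_PY)
        (dir0 / "b.py").write_text("x = 'hello'\n")
        (dir0 / "dir1" / "c.py").write_text("y = 'bye'\n")
        result = subprocess.run(
            [sys.executable, str(dir0 / "a.py")],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return result.stdout.split()


demo_output = run_file_import_demo()
print(demo_output)  # ['hello', 'bye']
```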
Package imports
Just as a file containing Python code is known as a module, a directory containing modules is known as a package.
a) Absolute package imports
Let’s add __init__.py to dir1
import_examples/
└── dir0
    ├── a.py
    ├── b.py
    └── dir1
        ├── __init__.py
        └── c.py
and modify a.py
# a.py
import b
import dir1.c # absolute package import
print(b.x)
print(dir1.c.y)
Now,
$ /usr/bin/python3.5 dir0/a.py # prints "hello", "bye"
because Python allows something called absolute package imports.
An absolute package import follows import with dot notation, e.g.
import dir1.dir2.dir3.dir4.myfile
where dir1, dir2, dir3, dir4 each contain __init__.py (this file marks the directory it is in as a regular package; since Python 3.3 a directory without it can still be imported as a “namespace package”, but including __init__.py remains the common convention).
The interpreter follows the same steps as before, only now it also deals with dot notation: it replaces the dots with operating system path separators, e.g. / for Linux.
In a.py, import dir1.c is an absolute package import.
So, the first file the interpreter looks for is /path/to/import_examples/dir0 prepended to dir1/c.py, i.e. /path/to/import_examples/dir0/dir1/c.py, which exists.
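The dots-to-separators step can be written out as a tiny function. This is only a simplified model of the idea (the real machinery also checks each intermediate package and several file types); the function name candidate_module_file is invented for the example, and posixpath is used so the result uses / regardless of the platform you run it on.

```python
import posixpath


def candidate_module_file(search_dir, dotted_name):
    """Turn an MSP entry plus a dotted import name into a candidate file path."""
    parts = dotted_name.split(".")  # 'dir1.c' -> ['dir1', 'c']
    return posixpath.join(search_dir, *parts) + ".py"


candidate = candidate_module_file("/path/to/import_examples/dir0", "dir1.c")
print(candidate)  # /path/to/import_examples/dir0/dir1/c.py
```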
b) Relative package imports
Let’s modify our directory tree
import_examples/
└── dir0
    ├── a.py
    ├── b.py
    └── dir1
        ├── __init__.py
        ├── c.py
        └── dir2
            └── d.py
with Python files
# a.py
import b
import dir1.c # absolute package import
print(b.x)
print(dir1.c.y)
print(dir1.c.d.z) # extra print
# b.py
x = 'hello'
# c.py
y = 'bye'
from .dir2 import d # relative package import
# d.py
from .. import c # relative package import
print('c.y in d.py: {}'.format(c.y)) # str.format, as f-strings need Python 3.6+
z = 'ciao'
Now,
$ /usr/bin/python3.5 dir0/a.py # prints below
c.y in d.py: bye
hello
bye
ciao
Great!
But how does this work?
The above makes use of relative package imports, which are denoted by from followed by dot syntax. They are relative to the file in which they appear, e.g.
from .dir2 import d
in c.py means
Go to a directory called dir2 located in the same directory as c.py, then look in dir2 for a file called d.py
and
from .. import c
in d.py means
Go to the parent directory of the directory in which d.py is located, then look for a file called c.py
In relative package imports, the MSP is not searched! The lookup starts from the package that contains the importing module.
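Under the hood, Python turns the leading dots into an absolute dotted name, based on the package of the module doing the import. The standard library exposes this step as importlib.util.resolve_name, so the two imports can be checked without creating any files. The package names passed in below ("dir1" for c.py, "dir1.dir2" for d.py) are what these modules get when a.py is run as in this post.

```python
from importlib.util import resolve_name

# c.py lives in package "dir1": "from .dir2 import d" targets dir1.dir2
target_in_c = resolve_name(".dir2", "dir1")

# d.py lives in package "dir1.dir2": "from .. import c" targets dir1
target_in_d = resolve_name("..", "dir1.dir2")

print(target_in_c)  # dir1.dir2  (d is then looked up inside it)
print(target_in_d)  # dir1       (c is then looked up inside it)
```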
c) Relative package imports - common errors
Suppose in d.py we also wanted to import b.py using a relative package import
from ... import b
Running a.py then fails with
    from ... import b
ValueError: attempted relative import beyond top-level package
Why this?
Relative imports are resolved against the dotted name under which a module was imported, not against the directory tree on disk. a.py does import dir1.c, so c.py is known to Python as dir1.c and d.py as dir1.dir2.d. This makes dir1 the root (top-level) package for our relative package imports.
As from ... takes us into dir0, i.e. above the root package dir1, Python raises the above ValueError exception.
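The same failure can be reproduced in isolation with importlib.util.resolve_name, which performs the dots-to-name step on its own. The helper try_resolve is made up for this sketch; it catches both ValueError and ImportError because, as far as I am aware, the exception type raised for this case differs between Python versions.

```python
from importlib.util import resolve_name


def try_resolve(relative_name, package):
    """Resolve a relative name, returning the error text instead of raising."""
    try:
        return resolve_name(relative_name, package)
    except (ImportError, ValueError) as exc:
        return "error: {}".format(exc)


# d.py is imported as dir1.dir2.d, so its package is "dir1.dir2".
two_dots = try_resolve("..", "dir1.dir2")     # one level up: dir1
three_dots = try_resolve("...", "dir1.dir2")  # two levels up: above dir1

print(two_dots)    # dir1
print(three_dots)  # error: attempted relative import beyond top-level package
```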
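What if we added __init__.py to dir0?
This does not work: a.py is run as a script, so dir0 is only the “home” entry in the MSP, not a package that is imported by name. dir1 is still the root package, not dir0.
What if we added __init__.py to dir0 and removed __init__.py in dir1?
This does not help either. Since Python 3.3, a directory without __init__.py can still be imported as a namespace package, so import dir1.c in a.py continues to work and d.py is still imported as dir1.dir2.d. from ... import b therefore still points above the root package and raises the same error.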
What if we used a relative package import in a.py instead, i.e. from .dir1 import c? We get another error
ModuleNotFoundError: No module named '__main__.dir1'; '__main__' is not a package
because Python does not let you do relative package imports in top-level scripts.
The upshot of this is that we cannot run
$ /usr/bin/python3.5 dir0/a.py
and import b.py in d.py using a relative package import.
We have to use an absolute package import instead
# d.py
import b # absolute package import
from .. import c # relative package import
print('c.y in d.py: {}'.format(c.y)) # str.format, as f-strings need Python 3.6+
z = 'ciao'
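To check the final arrangement end to end, the sketch below recreates the whole directory tree in a temporary directory and runs dir0/a.py with the current interpreter. The file contents mirror the ones in this post, except that the print in d.py uses str.format so it runs on older Python 3 versions too. The names FILES and run_layout are just for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Relative path -> file contents, mirroring the layout used in this post.
FILES = {
    "dir0/a.py": (
        "import b\n"
        "import dir1.c  # absolute package import\n"
        "print(b.x)\n"
        "print(dir1.c.y)\n"
        "print(dir1.c.d.z)\n"
    ),
    "dir0/b.py": "x = 'hello'\n",
    "dir0/dir1/__init__.py": "",
    "dir0/dir1/c.py": (
        "y = 'bye'\n"
        "from .dir2 import d  # relative package import\n"
    ),
    "dir0/dir1/dir2/d.py": (
        "import b  # absolute import, found via the 'home' entry of the MSP\n"
        "from .. import c  # relative package import\n"
        "print('c.y in d.py: {}'.format(c.y))\n"
        "z = 'ciao'\n"
    ),
}


def run_layout():
    """Write FILES into a temp directory and run dir0/a.py from its root."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for relative_path, source in FILES.items():
            target = root / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(source)
        result = subprocess.run(
            [sys.executable, "dir0/a.py"],
            cwd=str(root),
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return result.stdout.splitlines()


output_lines = run_layout()
print(output_lines)  # ['c.y in d.py: bye', 'hello', 'bye', 'ciao']
```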
Conclusion
As the last example shows, package imports in Python are not always straightforward even for relatively simple use cases.
However, despite this, my view is that they are still far preferable to modifying PYTHONPATH or sys.path.
Further, package imports are in widespread use, so even if you don’t use them yourself, understanding them will be helpful when reading others’ code.
Disclaimer: In no way, shape, or form do I claim all the content in this post to be my own work / not copied, paraphrased, or derived in any other way from an external source.
To the best of my knowledge, all sources used are referenced. If you feel strongly about any of the content in this post from a plagiarism, copyright, etc. point of view, please do not hesitate to get in touch to discuss and resolve the situation.