Preface
Recently, I've been working on a research project within the team. One of the existing tools was developed in Python, and my goal is to optimize its workflow. I could have asked the original developer about the current process, proposed an optimization based on my research, and had them implement it. However, as a programmer at heart, I wanted to figure things out myself, so I embarked on a journey of exploration.
I searched online, downloaded the PyCharm IDE, configured the environment, and started working. Since I had no prior experience with Python, when I opened the code folder, I was overwhelmed by a pile of .py files and had no idea where to start. My background is in C#, with a little C/C++, so I kept looking for something like a Main() function that would serve as the entry point of the program. But after searching the entire directory, I found no clues, so I began tinkering.
It should be noted that this article is just a beginner's learning note about Python, and it may not be entirely correct or complete. Any corrections are welcome. Every time I encounter a new tool or language, I feel an inexplicable joy and enjoy using old knowledge to reason about it, hence this record.
Sequential Execution
In the Python world, each .py file is a module. You can run a module as a script by passing its filename to the python command in the console.
Modules are somewhat similar to batch files (.bat), where statements are executed sequentially.
This was different from my initial expectation. I thought, like C# and other languages, the file would be organized by classes, but that's not the case.
First, create a file named Test1.py in the root directory D:\ with the following content:
print("Test1 First")
print("Test1 Second")
Then, switch to the console, change the directory to D:\, and run the Test1.py module. The result is:
D:\>python Test1.py
Test1 First
Test1 Second
Great, it works as expected. Now let's try calling between modules.
Create Test2.py in the D:\ directory and import the Test1 module:
import Test1
print("Test2 First")
print("Test2 Second")
Run Test2.py in the console:
D:\>python Test2.py
Test1 First
Test1 Second
Test2 First
Test2 Second
Understandably, in Test2.py, the import Test1 statement appears at the beginning, so when the Test1 module is imported, its statements are executed, causing the output from Test1 to appear first.
If you put import Test1 at the end, the output from Test1 will also appear at the end:
print("Test2 First")
print("Test2 Second")
import Test1
D:\>python Test2.py
Test2 First
Test2 Second
Test1 First
Test1 Second
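To convince myself that an import really runs the imported module's statements at the exact point where the import appears, I put together the following self-contained sketch. It is only an illustration under a few assumptions: the module name demo_import_order is made up for this example, the file is written to a temporary directory instead of D:\, and importlib.import_module stands in for a plain import statement.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module name and content, invented for this sketch.
MODULE_NAME = "demo_import_order"
MODULE_SOURCE = 'print("module First")\nprint("module Second")\n'


def run_demo():
    """Write a tiny module, import it, and return every line that was printed."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, MODULE_NAME + ".py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)           # make the temporary folder importable
        importlib.invalidate_caches()
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                print("before import")
                importlib.import_module(MODULE_NAME)  # module body runs here
                print("after import")
        finally:
            # Undo the changes so the sketch leaves no traces behind.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return buffer.getvalue().splitlines()


lines = run_demo()
print(lines)
# ['before import', 'module First', 'module Second', 'after import']
```

The module's two lines land between "before import" and "after import", which matches what I saw when moving import Test1 around in Test2.py.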
Function Definition
Can the code in a module be more flexible? Besides sequential execution, can you call code on demand, like functions in C#?
The print used above is a built-in function. After consulting the documentation, I found the syntax for defining functions in Python:
def function_name(parameter_list):
    function_body
Let's try it immediately. Define a SayHello function in Test1.py:
print("Test1 First")
print("Test1 Second")
def SayHello():
    print("Hello World")
SayHello()
print("Test1 Third")
Output:
D:\>python Test1.py
Test1 First
Test1 Second
Hello World
Test1 Third
Good, it meets expectations – executed sequentially.
If you only define SayHello() but don't call it, there will be no Hello World line in the output.
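One consequence of sequential execution that surprised me, coming from C#: the def statement is itself executed in order, so a function cannot be called before the line that defines it has run. The sketch below is my own illustration, not part of the original tool; it uses exec on two small source strings only so that both cases fit in one runnable snippet.

```python
def run_source(source):
    """Execute a small script in a fresh namespace and report what happened."""
    try:
        exec(source, {})
    except NameError as error:
        return f"NameError: {error}"
    return "ok"


# The call comes before the def has been executed.
call_first = (
    "say_hello()\n"
    "def say_hello():\n"
    "    pass\n"
)

# The def is executed first, then the call.
define_first = (
    "def say_hello():\n"
    "    pass\n"
    "say_hello()\n"
)

result_call_first = run_source(call_first)
result_define_first = run_source(define_first)
print(result_call_first)
print(result_define_first)
```

The first script fails with a NameError mentioning say_hello, while the second runs cleanly. In C# the order of method declarations inside a class does not matter, so this is worth keeping in mind.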
Next, try calling a function across modules. Modify Test2.py:
import Test1
print("Test2 First")
print("Test2 Second")
Test1.SayHello()
Output:
D:\>python Test2.py
Test1 First
Test1 Second
Hello World
Test1 Third
Test2 First
Test2 Second
Hello World
Haha, that's right! The last Hello World is output by the Test1.SayHello() statement in Test2.py.
As for the earlier Hello World on the third line of the output, it was printed by the Test1 module itself when import Test1 was executed.
__main__
After understanding function definitions and cross-module calls, a question arises: where is the entry point of the program/module?
I searched for information and found the __name__ attribute. Let's test it with some code. Modify Test1.py:
def SayHello():
    print("Hello World")
def SayBye():
    print("Bye World")
SayHello()
if(__name__=="__main__"):
    print("Main")
SayBye()
Run Test1.py directly in the console:
D:\>python Test1.py
Hello World
Main
Bye World
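Okay, that's understandable. The statements execute sequentially, and because the condition if(__name__=="__main__") is met, Main is printed. (The parentheses are optional in Python; the same check is more commonly written as if __name__ == "__main__": without them.)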
But wait, try calling Test1.py indirectly through Test2.py. First, modify Test2.py:
import Test1
print("Test2 First")
print("Test2 Second")
Then run Test2.py:
D:\>python Test2.py
Hello World
Bye World
Test2 First
Test2 Second
What? No Main output? Interesting. I found an explanation on Runoob:
Every module has a __name__ attribute. When its value is __main__, it indicates that the module itself is running; otherwise, it has been imported.
The __name__ attribute is easy to understand: it's an attribute that Python sets on every module. But how do we understand __main__?
Here, __main__ is not an entry function. It is the name Python gives to the top-level module, that is, the file the interpreter was asked to run. If a module is run directly, its __name__ attribute is the string "__main__"; if it is imported, __name__ is the module's name, which is the filename without the .py extension:
def SayHello():
    print("Hello World")
def SayBye():
    print("Bye World")
SayHello()
if(__name__=="__main__"):
    print("Main")
else:
    print(__name__)
SayBye()
D:\>python Test2.py
Hello World
Test1
Bye World
Test2 First
Test2 Second
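Since I had been hunting for a Main() function, here is a sketch of how the __name__ check can play that role. Treat it as an illustration under assumptions: the module name demo_name_check and its main() function are invented for this example, runpy.run_path with run_name="__main__" approximates running the file with the python command, and importlib.import_module stands in for an import statement.

```python
import contextlib
import importlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical module, invented for this sketch.
MODULE_NAME = "demo_name_check"
MODULE_SOURCE = '''
def main():
    print("running as a script")

print("__name__ is", __name__)

if __name__ == "__main__":
    main()
'''


def capture(action):
    """Run action() and return the lines it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        action()
    return buffer.getvalue().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, MODULE_NAME + ".py")
    path.write_text(MODULE_SOURCE)

    # Approximates "python demo_name_check.py": __name__ becomes "__main__".
    as_script = capture(lambda: runpy.run_path(str(path), run_name="__main__"))

    # Approximates "import demo_name_check": __name__ becomes the module name.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        as_import = capture(lambda: importlib.import_module(MODULE_NAME))
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(as_script)   # ['__name__ is __main__', 'running as a script']
print(as_import)   # ['__name__ is demo_name_check']
```

Putting the script-only work inside main() and guarding the call this way means the file can be both run directly and imported by other modules without side effects beyond its definitions.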
Summary
This article covered some basic features of Python modules. The knowledge is quite shallow, intended only to document my personal learning process.
Finally, I quote some important explanations about modules from Runoob:
- In addition to method definitions, modules can contain executable code. This code is typically used to initialize the module.
- A module is imported only once, no matter how many times you execute import. This prevents the imported module from being executed repeatedly.
- Modules can import other modules. It is a common practice to place import statements at the beginning of a module, although it's not strictly required.
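The "imported only once" point can be checked with a small sketch. As far as I understand it, Python keeps already imported modules in the sys.modules dictionary and reuses them on later imports; importlib.reload is the explicit way to execute a module's code again. The module name demo_import_once is made up for this example, and importlib.import_module stands in for an import statement.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module, invented for this sketch.
MODULE_NAME = "demo_import_once"
MODULE_SOURCE = 'print("initializing module")\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, MODULE_NAME + ".py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            first = importlib.import_module(MODULE_NAME)   # executes the module body
            second = importlib.import_module(MODULE_NAME)  # reuses the cached module
            printed_after_imports = buffer.getvalue().splitlines()
            cached = MODULE_NAME in sys.modules
            importlib.reload(first)                        # explicitly runs the body again
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

printed_total = buffer.getvalue().splitlines()
same_object = first is second

print(printed_after_imports)  # ['initializing module']
print(same_object, cached)    # True True
print(len(printed_total))     # 2
```

Two imports print the initialization line only once and return the same module object; the second line of output appears only because of the explicit reload.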