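What's New in Python 2.6
- Author:
- A.M. Kuchling (amk at amk.ca)
This article explains the new features in Python 2.6, released on October 1, 2008. The release schedule is described in PEP 361 [https://peps.python.org/pep-0361/].
The major theme of Python 2.6 is preparing the migration path to Python 3.0, a major redesign of the language. Whenever possible, Python 2.6 incorporates new features and syntax from 3.0 while remaining compatible with existing code by not removing older features or syntax. When that isn't possible, Python 2.6 does what it can, adding compatibility functions in a future_builtins module and a -3 switch to warn about usages that will become unsupported in 3.0.
Some significant new packages have been added to the standard library, such as the multiprocessing and json modules, but there aren't many new features that are unrelated to Python 3.0.
Python 2.6 also sees a number of improvements and bugfixes throughout the source. A search through the change logs finds there were 259 patches applied and 612 bugs fixed between Python 2.5 and 2.6. Both figures are likely to be underestimates.
This article doesn't attempt to provide a complete specification of the new features, but instead provides a convenient overview. For full details, refer to the documentation for Python 2.6. If you want to understand the rationale for the design and implementation, refer to the PEP for a particular new feature. Whenever possible, "What's New in Python" links to the bug/patch item for each change.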
Python 3.0
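The development cycles for Python versions 2.6 and 3.0 were synchronized, with the alpha and beta releases for both versions being made on the same days. The development of 3.0 has influenced many features in 2.6.
Python 3.0 is a far-ranging redesign of Python that breaks compatibility with the 2.x series. This means that existing Python code will need some conversion in order to run on Python 3.0. However, not all the changes in 3.0 break compatibility. In cases where new features won't cause existing code to break, they've been backported to 2.6 and are described in this document in the appropriate place. Some of the 3.0-derived features are:
- A __complex__() method for converting objects to a complex number.
- Alternate syntax for catching exceptions: except TypeError as exc.
- The addition of functools.reduce() as a synonym for the built-in reduce() function.
Python 3.0 adds several new built-in functions and changes the semantics of some existing builtins. Functions that are new in 3.0, such as bin(), have simply been added to Python 2.6, but existing builtins haven't been changed; instead, the future_builtins module has versions with the new 3.0 semantics. Code written to be compatible with 3.0 can do from future_builtins import hex, map as necessary.
A new command-line switch, -3, enables warnings about features that will be removed in Python 3.0. You can run code with this switch to see how much work will be necessary to port code to 3.0. The value of this switch is available to Python code as the boolean variable sys.py3kwarning, and to C extension code as Py_Py3kWarningFlag.
See also
The 3xxx series of PEPs contains proposals for Python 3.0. PEP 3000 [https://peps.python.org/pep-3000/] describes the development process for Python 3.0. Start with PEP 3100 [https://peps.python.org/pep-3100/], which describes the general goals for Python 3.0, and then explore the higher-numbered PEPs that propose specific features.
Development Changes
While 2.6 was being developed, the Python development process underwent two significant changes: we switched from SourceForge's issue tracker to a customized Roundup installation, and the documentation was converted from LaTeX to reStructuredText.
New Issue Tracker: Roundup
For a long time, the Python developers had been growing increasingly annoyed by SourceForge's bug tracker. SourceForge's hosted solution doesn't permit much customization; for example, it wasn't possible to customize the life cycle of issues.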
The infrastructure committee of the Python Software Foundation therefore posted a call for issue trackers, asking volunteers to set up different products and import some of the bugs and patches from SourceForge. Four different trackers were examined: Jira [https://www.atlassian.com/software/jira/], Launchpad [https://launchpad.net/], Roundup [https://roundup.sourceforge.io/], and Trac [https://trac.edgewall.org/]. The committee eventually settled on Jira and Roundup as the two candidates. Jira is a commercial product that offers no-cost hosted instances to free-software projects; Roundup is an open-source project that requires volunteers to administer it and a server to host it.
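After posting a call for volunteers, a new Roundup installation was set up at https://bugs.python.org. One installation of Roundup can host multiple trackers, and this server now also hosts issue trackers for Jython and for the Python web site. It will surely find other uses in the future. Where possible, this edition of "What's New in Python" links to the bug/patch item for each change.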
Hosting of the Python bug tracker is kindly provided by Upfront Systems [https://upfrontsoftware.co.za] of Stellenbosch, South Africa. Martin von Löwis put a lot of effort into importing existing bugs and patches from SourceForge; his scripts for this import operation are at https://svn.python.org/view/tracker/importer/
and may be useful to other projects wishing to move from SourceForge to Roundup.
See also
- https://bugs.python.org
- The Python bug tracker.
- The Jython bug tracker.
- Roundup downloads and documentation.
- Martin von Löwis's conversion scripts.
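New Documentation Format: reStructuredText Using Sphinx
The Python documentation was written using LaTeX since the project started around 1989. In the 1980s and early 1990s, most documentation was printed out for later study, not viewed online. LaTeX was widely used because it provided attractive printed output while remaining straightforward to write once the basic rules of the markup were learned.
Today LaTeX is still used for writing publications destined for printing, but the landscape for programming tools has shifted. We no longer print out reams of documentation; instead, we browse through it online, and HTML has become the most important format to support. Unfortunately, converting LaTeX to HTML is fairly complicated, and Fred L. Drake Jr., the long-time Python documentation editor, spent a lot of time maintaining the conversion process. Occasionally people would suggest converting the documentation into SGML and later XML, but performing a good conversion is a major task and no one ever committed the time required to finish the job.
During the 2.6 development cycle, Georg Brandl put a lot of effort into building a new toolchain for processing the documentation. The resulting package is called Sphinx, and is available from https://www.sphinx-doc.org/.
Sphinx concentrates on HTML output, producing attractively styled and modern HTML; printed output is still supported through conversion to LaTeX. The input format is reStructuredText, a markup syntax supporting custom extensions and directives that is commonly used in the Python community.
Sphinx is a standalone package that can be used for writing, and almost two dozen other projects (listed on the Sphinx web site [https://www.sphinx-doc.org/en/master/examples.html]) have adopted Sphinx as their documentation tool.
See also
- Documenting Python [https://devguide.python.org/documenting/]
- Describes how to write for Python's documentation.
- Documentation and code for the Sphinx toolchain.
- The underlying reStructuredText parser and toolset.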
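PEP 343: The "with" statement
The previous version, Python 2.5, added the 'with' statement as an optional feature, to be enabled by a from __future__ import with_statement directive. In 2.6 the statement no longer needs to be specially enabled; this means that with is now always a keyword. The rest of this section is a copy of the corresponding section from the "What's New in Python 2.5" document; if you're familiar with the 'with' statement from Python 2.5, you can skip this section.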
The 'with
' statement clarifies code that previously would use try…finally
blocks to ensure that clean-up code is executed. In this section, I'll discuss the statement as it will commonly be used. In the next section, I'll examine the implementation details and show how to write objects for use with this statement.
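The 'with' statement is a control-flow structure whose basic structure is:
- with expression [as variable]:
-     with-block
The expression is evaluated, and it should result in an object that supports the context management protocol (that is, has __enter__() and __exit__() methods).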
The object's __enter__()
is called before with-block is executed and therefore can run set-up code. It also may return a value that is bound to the name variable, if given. (Note carefully that variable is not assigned the result of expression.)
After execution of the with-block is finished, the object's __exit__()
method is called, even if the block raised an exception, and can therefore run clean-up code.
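Some standard Python objects now support the context management protocol and can be used with the 'with' statement. File objects are one example:
- with open('/etc/passwd', 'r') as f:
-     for line in f:
-         print line
-         ... more processing code ...
After this statement has executed, the file object in f will have been automatically closed, even if the for loop raised an exception part-way through the block.
Note
In this case, f is the same object created by open(), because __enter__() returns self.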
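The threading module's locks and condition variables also support the 'with' statement:
- lock = threading.Lock()
- with lock:
-     # Critical section of code
-     ...
The lock is acquired before the block is executed and always released once the block is complete.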
The localcontext()
function in the decimal
module makes it easy to save and restore the current decimal context, which encapsulates the desired precision and rounding characteristics for computations:
- from decimal import Decimal, Context, localcontext
- # Displays with default precision of 28 digits
- v = Decimal('578')
- print v.sqrt()
- with localcontext(Context(prec=16)):
-     # All code in this block uses a precision of 16 digits.
-     # The original context is restored on exiting the block.
-     print v.sqrt()
Writing Context Managers
Under the hood, the 'with
' statement is fairly complicated. Most people will only use 'with
' in company with existing objects and don't need to know these details, so you can skip the rest of this section if you like. Authors of new objects will need to understand the details of the underlying implementation and should keep reading.
A high-level explanation of the context management protocol is:
- The expression is evaluated and should result in an object called a "context manager". The context manager must have __enter__() and __exit__() methods.
- The context manager's __enter__() method is called. The value returned is assigned to VAR. If no as VAR clause is present, the value is simply discarded.
- The code in BLOCK is executed.
- If BLOCK raises an exception, the context manager's __exit__() method is called with three arguments, the exception details (type, value, traceback, the same values returned by sys.exc_info(), which can also be None if no exception occurred). The method's return value controls whether an exception is re-raised: any false value re-raises the exception, and True will result in suppressing it. You'll only rarely want to suppress the exception, because if you do, the author of the code containing the 'with' statement will never realize anything went wrong.
- If BLOCK didn't raise an exception, the __exit__() method is still called, but type, value, and traceback are all None.
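The steps above can be traced with a small recording object. This is only an illustrative sketch: the Tracer class and its suppress flag are made up for the example and are not part of the standard library.

```python
class Tracer(object):
    """Hypothetical context manager that records each protocol step."""

    def __init__(self, suppress=False):
        self.events = []
        self.suppress = suppress

    def __enter__(self):
        self.events.append('enter')
        return 'bound value'   # this value, not the Tracer, is bound by "as"

    def __exit__(self, exc_type, exc_value, tb):
        # All three arguments are None when the block finished normally.
        self.events.append(('exit', exc_type))
        return self.suppress   # a true value swallows the exception


normal = Tracer()
with normal as var:
    normal.events.append('block')

quiet = Tracer(suppress=True)
with quiet:
    raise ValueError('swallowed by __exit__')

loud = Tracer()
try:
    with loud:
        raise KeyError('re-raised after __exit__')
except KeyError:
    loud.events.append('propagated')
```

After this runs, normal.events shows the enter, block, exit order, and only the Tracer created with suppress=True prevents its exception from reaching the caller.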
Let's think through an example. I won't present detailed code but will only sketch the methods necessary for a database that supports transactions.
(For people unfamiliar with database terminology: a set of changes to the database are grouped into a transaction. Transactions can be either committed, meaning that all the changes are written into the database, or rolled back, meaning that the changes are all discarded and the database is unchanged. See any database textbook for more information.)
Let's assume there's an object representing a database connection. Our goal will be to let the user write code like this:
- db_connection = DatabaseConnection()
- with db_connection as cursor:
-     cursor.execute('insert into ...')
-     cursor.execute('delete from ...')
-     # ... more operations ...
The transaction should be committed if the code in the block runs flawlessly or rolled back if there's an exception. Here's the basic interface for DatabaseConnection
that I'll assume:
- class DatabaseConnection:
-     # Database interface
-     def cursor(self):
-         "Returns a cursor object and starts a new transaction"
-     def commit(self):
-         "Commits current transaction"
-     def rollback(self):
-         "Rolls back current transaction"
The __enter__()
method is pretty easy, having only to start a new transaction. For this application the resulting cursor object would be a useful result, so the method will return it. The user can then add as cursor
to their 'with
' statement to bind the cursor to a variable name.
- class DatabaseConnection:
-     ...
-     def __enter__(self):
-         # Code to start a new transaction
-         cursor = self.cursor()
-         return cursor
The __exit__()
method is the most complicated because it's where most of the work has to be done. The method has to check if an exception occurred. If there was no exception, the transaction is committed. The transaction is rolled back if there was an exception.
In the code below, execution will just fall off the end of the function, returning the default value of None
. None
is false, so the exception will be re-raised automatically. If you wished, you could be more explicit and add a return
statement at the marked location.
- class DatabaseConnection:
-     ...
-     def __exit__(self, type, value, tb):
-         if tb is None:
-             # No exception, so commit
-             self.commit()
-         else:
-             # Exception occurred, so rollback.
-             self.rollback()
-             # return False
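To see the pieces working together, here is a minimal runnable sketch. It assumes an in-memory stand-in for the database: FakeCursor, the log list, and the "bad statement" failure trigger are invented for illustration and do not correspond to any real database driver.

```python
class FakeCursor(object):
    """Stand-in cursor that only records the statements it is given."""

    def __init__(self, log):
        self.log = log

    def execute(self, statement):
        if 'bad' in statement:
            raise RuntimeError('simulated failure')
        self.log.append(statement)


class DatabaseConnection(object):
    def __init__(self):
        self.log = []

    def cursor(self):
        self.log.append('BEGIN')       # pretend a transaction starts here
        return FakeCursor(self.log)

    def commit(self):
        self.log.append('COMMIT')

    def rollback(self):
        self.log.append('ROLLBACK')

    def __enter__(self):
        return self.cursor()

    def __exit__(self, type, value, tb):
        if tb is None:
            self.commit()
        else:
            self.rollback()
        return False                   # never suppress the exception


ok = DatabaseConnection()
with ok as cursor:
    cursor.execute('insert into ...')

failed = DatabaseConnection()
try:
    with failed as cursor:
        cursor.execute('bad statement')
except RuntimeError:
    pass
```

The first connection's log ends with COMMIT and the second with ROLLBACK, and the RuntimeError still reaches the caller because __exit__() returns a false value.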
The contextlib module
The contextlib
module provides some functions and a decorator that are useful when writing objects for use with the 'with
' statement.
The decorator is called contextmanager()
, and lets you write a single generator function instead of defining a new class. The generator should yield exactly one value. The code up to the yield
will be executed as the __enter__()
method, and the value yielded will be the method's return value that will get bound to the variable in the 'with
' statement's as
clause, if any. The code after the yield
will be executed in the __exit__()
method. Any exception raised in the block will be raised by the yield
statement.
Using this decorator, our database example from the previous section could be written as:
- from contextlib import contextmanager
- @contextmanager
- def db_transaction(connection):
-     cursor = connection.cursor()
-     try:
-         yield cursor
-     except:
-         connection.rollback()
-         raise
-     else:
-         connection.commit()
- db = DatabaseConnection()
- with db_transaction(db) as cursor:
-     ...
The contextlib
module also has a nested(mgr1, mgr2, …)
function that combines a number of context managers so you don't need to write nested 'with
' statements. In this example, the single 'with
' statement both starts a database transaction and acquires a thread lock:
- import threading
- from contextlib import nested
- lock = threading.Lock()
- with nested(db_transaction(db), lock) as (cursor, locked):
-     ...
Finally, the closing()
function returns its argument so that it can be bound to a variable, and calls the argument's .close()
method at the end of the block.
- import urllib, sys
- from contextlib import closing
- with closing(urllib.urlopen('http://www.yahoo.com')) as f:
-     for line in f:
-         sys.stdout.write(line)
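The same two helpers can be tried without network access. In this sketch the Resource class and the tagged() generator are hypothetical names chosen for illustration; only closing() and contextmanager() come from contextlib.

```python
from contextlib import closing, contextmanager


class Resource(object):
    """Hypothetical object with a close() method but no 'with' support."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def tagged(log, name):
    log.append('enter ' + name)      # runs as __enter__()
    try:
        yield name                   # value bound by the "as" clause
    finally:
        log.append('exit ' + name)   # runs as __exit__(), even on error


log = []
res = Resource()
with closing(res) as r:
    with tagged(log, 'inner') as label:
        log.append('body')
```

Here closing() hands back the very object it was given and calls its close() method when the block ends, while the generator's code before and after the yield brackets the block body.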
See also
- PEP 343 [https://peps.python.org/pep-0343/] - The "with" statement
- PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland, Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a 'with' statement, which can be helpful in learning how the statement works.
- The documentation for the contextlib module.
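PEP 366: Explicit Relative Imports From a Main Module
Python's -m switch allows running a module as a script. When you ran a module that was located inside a package, relative imports didn't work correctly.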
The fix for Python 2.6 adds a module.__package__
attribute. When this attribute is present, relative imports will be relative to the value of this attribute instead of the __name__
attribute.
PEP 302-style importers can then set __package__
as necessary. The runpy
module that implements the -m
switch now does this, so relative imports will now work correctly in scripts running from inside a package.
PEP 370: Per-user site-packages Directory
When you run Python, the module search path sys.path
usually includes a directory whose path ends in "site-packages"
. This directory is intended to hold locally installed packages available to all users using a machine or a particular site installation.
Python 2.6 introduces a convention for user-specific site directories. The directory varies depending on the platform:
- Unix and Mac OS X: ~/.local/
- Windows: %APPDATA%/Python
Within this directory, there will be version-specific subdirectories, such as lib/python2.6/site-packages
on Unix/Mac OS and Python26/site-packages
on Windows.
If you don't like the default directory, it can be overridden by an environment variable. PYTHONUSERBASE
sets the root directory used for all Python versions supporting this feature. On Windows, the directory for application-specific data can be changed by setting the APPDATA
environment variable. You can also modify the site.py
file for your Python installation.
The feature can be disabled entirely by running Python with the -s
option or setting the PYTHONNOUSERSITE
environment variable.
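To check which directories a given interpreter will actually use, the site module can be queried. This is a sketch with one caveat: site.getuserbase() and site.getusersitepackages() were added in releases after 2.6, so on 2.6 itself the attributes site.USER_BASE and site.USER_SITE would be read instead.

```python
import site

# Root of the per-user tree; honours PYTHONUSERBASE when it is set.
user_base = site.getuserbase()

# Version-specific directory that is added to sys.path.
user_site = site.getusersitepackages()

# Not true when the feature is switched off, for example by the -s option.
enabled = site.ENABLE_USER_SITE

print('user base: {0}'.format(user_base))
print('user site: {0}'.format(user_site))
print('enabled:   {0}'.format(enabled))
```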
See also
- PEP 370 [https://peps.python.org/pep-0370/] - Per-user site-packages Directory
- PEP written and implemented by Christian Heimes.
PEP 371: The multiprocessing Package
The new multiprocessing
package lets Python programs create new processes that will perform a computation and return a result to the parent. The parent and child processes can communicate using queues and pipes, synchronize their operations using locks and semaphores, and can share simple arrays of data.
The multiprocessing
module started out as an exact emulation of the threading
module using processes instead of threads. That goal was discarded along the path to Python 2.6, but the general approach of the module is still similar. The fundamental class is the Process
, which is passed a callable object and a collection of arguments. The start()
method sets the callable running in a subprocess, after which you can call the is_alive()
method to check whether the subprocess is still running and the join()
method to wait for the process to exit.
Here's a simple example where the subprocess will calculate a factorial. The function doing the calculation is written strangely so that it takes significantly longer when the input argument is a multiple of 4.
- import time
- from multiprocessing import Process, Queue
- def factorial(queue, N):
-     "Compute a factorial."
-     # If N is a multiple of 4, this function will take much longer.
-     if (N % 4) == 0:
-         time.sleep(.05 * N/4)
-     # Calculate the result
-     fact = 1L
-     for i in range(1, N+1):
-         fact = fact * i
-     # Put the result on the queue
-     queue.put(fact)
- if __name__ == '__main__':
-     queue = Queue()
-     N = 5
-     p = Process(target=factorial, args=(queue, N))
-     p.start()
-     p.join()
-     result = queue.get()
-     print 'Factorial', N, '=', result
A Queue
is used to communicate the result of the factorial. The Queue
object is stored in a global variable. The child process will use the value of the variable when the child was created; because it's a Queue
, parent and child can use the object to communicate. (If the parent were to change the value of the global variable, the child's value would be unaffected, and vice versa.)
Two other classes, Pool
and Manager
, provide higher-level interfaces. Pool
will create a fixed number of worker processes, and requests can then be distributed to the workers by calling apply()
or apply_async()
to add a single request, and map()
or map_async()
to add a number of requests. The following code uses a Pool
to spread requests across 5 worker processes and retrieve a list of results:
- from multiprocessing import Pool
- def factorial(N):
-     "Compute a factorial."
-     ...
- p = Pool(5)
- result = p.map(factorial, range(1, 1000, 10))
- for v in result:
-     print v
This produces the following output:
- 1
- 39916800
- 51090942171709440000
- 8222838654177922817725562880000000
- 33452526613163807108170062053440751665152000000000
- ...
The other high-level interface, the Manager
class, creates a separate server process that can hold master copies of Python data structures. Other processes can then access and modify these data structures using proxy objects. The following example creates a shared dictionary by calling the dict()
method; the worker processes then insert values into the dictionary. (Locking is not done for you automatically, which doesn't matter in this example. Manager
's methods also include Lock()
, RLock()
, and Semaphore()
to create shared locks.)
- import time
- from multiprocessing import Pool, Manager
- def factorial(N, dictionary):
-     "Compute a factorial."
-     # Calculate the result
-     fact = 1L
-     for i in range(1, N+1):
-         fact = fact * i
-     # Store result in dictionary
-     dictionary[N] = fact
- if __name__ == '__main__':
-     p = Pool(5)
-     mgr = Manager()
-     d = mgr.dict()         # Create shared dictionary
-     # Run tasks using the pool
-     for N in range(1, 1000, 10):
-         p.apply_async(factorial, (N, d))
-     # Mark pool as closed -- no more tasks can be added.
-     p.close()
-     # Wait for tasks to exit
-     p.join()
-     # Output results
-     for k, v in sorted(d.items()):
-         print k, v
This will produce the output:
- 1 1
- 11 39916800
- 21 51090942171709440000
- 31 8222838654177922817725562880000000
- 41 33452526613163807108170062053440751665152000000000
- 51 15511187532873822802242430164693032110632597200169861120000...
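See also
- The documentation for the multiprocessing module.
- PEP 371 [https://peps.python.org/pep-0371/] - Addition of the multiprocessing package
- PEP written by Jesse Noller and Richard Oudkerk; implemented by Richard Oudkerk and Jesse Noller.
PEP 3101: Advanced String Formatting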
In Python 3.0, the %
operator is supplemented by a more powerful string formatting method, format()
. Support for the str.format()
method has been backported to Python 2.6.
In 2.6, both 8-bit and Unicode strings have a .format()
method that treats the string as a template and takes the arguments to be formatted. The formatting template uses curly brackets ({
, }
) as special characters:
- >>> # Substitute positional argument 0 into the string.
- >>> "User ID: {0}".format("root")
- 'User ID: root'
- >>> # Use the named keyword arguments
- >>> "User ID: {uid} Last seen: {last_login}".format(
- ... uid="root",
- ... last_login = "5 Mar 2008 07:20")
- 'User ID: root Last seen: 5 Mar 2008 07:20'
Curly brackets can be escaped by doubling them:
- >>> "Empty dict: {{}}".format()
- "Empty dict: {}"
Field names can be integers indicating positional arguments, such as {0}
, {1}
, etc. or names of keyword arguments. You can also supply compound field names that read attributes or access dictionary keys:
- >>> import sys
- >>> print 'Platform: {0.platform}\nPython version: {0.version}'.format(sys)
- Platform: darwin
- Python version: 2.6a1+ (trunk:61261M, Mar 5 2008, 20:29:41)
- [GCC 4.0.1 (Apple Computer, Inc. build 5367)]
- >>> import mimetypes
- >>> 'Content-type: {0[.mp4]}'.format(mimetypes.types_map)
- 'Content-type: video/mp4'
Note that when using dictionary-style notation such as [.mp4]
, you don't need to put any quotation marks around the string; it will look up the value using .mp4
as the key. Strings beginning with a number will be converted to an integer. You can't write more complicated expressions inside a format string.
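A short sketch of the compound field names described above. The Point class, the sizes dictionary, and the row list are invented for the example.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(3, 4)
sizes = {'small': 1, 'large': 10}
row = ['a', 'b', 'c']

by_attr = 'x={0.x} y={0.y}'.format(p)            # attribute access
by_key = 'large is {0[large]}'.format(sizes)     # no quotes around the key
by_index = 'second item: {0[1]}'.format(row)     # digits become an integer index
by_name = '{pt.x},{pt.y}'.format(pt=p)           # keyword argument plus attribute
```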
So far we've shown how to specify which field to substitute into the resulting string. The precise formatting used is also controllable by adding a colon followed by a format specifier. For example:
- >>> # Field 0: left justify, pad to 15 characters
- >>> # Field 1: right justify, pad to 6 characters
- >>> fmt = '{0:15} ${1:>6}'
- >>> fmt.format('Registration', 35)
- 'Registration    $    35'
- >>> fmt.format('Tutorial', 50)
- 'Tutorial        $    50'
- >>> fmt.format('Banquet', 125)
- 'Banquet         $   125'
Format specifiers can reference other fields through nesting:
- >>> fmt = '{0:{1}}'
- >>> width = 15
- >>> fmt.format('Invoice #1234', width)
- 'Invoice #1234  '
- >>> width = 35
- >>> fmt.format('Invoice #1234', width)
- 'Invoice #1234                      '
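The alignment of a field within the desired width can be specified:
Character | Effect |
---|---|
< (default) | Left-align |
> | Right-align |
^ | Center |
= | (For numeric types only) Pad after the sign. |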
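Each alignment character can be tried on a short value. In this sketch a trailing "|" is added only to make the padding visible.

```python
left = '{0:<8}|'.format('ab')
right = '{0:>8}|'.format('ab')
center = '{0:^8}|'.format('ab')
starred = '{0:*^8}|'.format('ab')   # a fill character may precede the alignment
signed = '{0:=6d}|'.format(-42)     # padding goes between the sign and the digits
```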
Format specifiers can also include a presentation type, which controls how the value is formatted. For example, floating-point numbers can be formatted as a general number or in exponential notation:
- >>> '{0:g}'.format(3.75)
- '3.75'
- >>> '{0:e}'.format(3.75)
- '3.750000e+00'
A variety of presentation types are available. Consult the 2.6 documentation for a complete list; here's a sample:
Type | Meaning |
---|---|
b | Binary. Outputs the number in base 2. |
c | Character. Converts the integer to the corresponding Unicode character before printing. |
d | Decimal integer. Outputs the number in base 10. |
o | Octal format. Outputs the number in base 8. |
x | Hex format. Outputs the number in base 16, using lower-case letters for the digits above 9. |
e | Exponent notation. Prints the number in scientific notation using the letter 'e' to indicate the exponent. |
g | General format. This prints the number as a fixed-point number, unless the number is too large, in which case it switches to 'e' exponent notation. |
n | Number. This is the same as 'g' (for floats) or 'd' (for integers), except that it uses the current locale setting to insert the appropriate number separator characters. |
% | Percentage. Multiplies the number by 100 and displays in fixed ('f') format, followed by a percent sign. |
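A few of these presentation types applied to small values, as a quick sketch. The 'n' type is left out because its output depends on the current locale.

```python
binary = '{0:b}'.format(10)
char = '{0:c}'.format(65)
decimal = '{0:d}'.format(255)
octal = '{0:o}'.format(8)
hexadecimal = '{0:x}'.format(255)
exponent = '{0:e}'.format(1234.5)
general = '{0:g}'.format(0.5)
percent = '{0:.1%}'.format(0.256)   # a precision can precede the type
```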
Classes and types can define a __format__()
method to control how they're formatted. It receives a single argument, the format specifier:
- def __format__(self, format_spec):
-     if isinstance(format_spec, unicode):
-         return unicode(str(self))
-     else:
-         return str(self)
There's also a format()
builtin that will format a single value. It calls the type's __format__()
method with the provided specifier:
- >>> format(75.6564, '.2f')
- '75.66'
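A class can also interpret the specifier itself. In this sketch the Money class and its 'usd' specifier are hypothetical; any other specifier is passed through to the underlying number.

```python
class Money(object):
    """Hypothetical type whose format spec can select a currency style."""

    def __init__(self, amount):
        self.amount = amount

    def __format__(self, format_spec):
        if format_spec == 'usd':
            return '$' + format(self.amount, '.2f')
        # Otherwise format the underlying number with the given spec.
        return format(self.amount, format_spec)


price = Money(75.6564)
via_builtin = format(price, 'usd')
via_method = 'Total: {0:usd} ({0:.1f})'.format(price)
```

Both the format() builtin and str.format() end up calling Money.__format__() with the text that follows the colon.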
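See also
- Format String Syntax
- The reference documentation for format fields.
- PEP 3101 [https://peps.python.org/pep-3101/] - Advanced String Formatting
- PEP written by Talin. Implemented by Eric Smith.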
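PEP 3105: print As a Function
The print statement becomes the print() function in Python 3.0. Making print() a function makes it possible to replace the function by doing def print(...) or importing a new function from somewhere else.
Python 2.6 has a __future__ import that removes print as language syntax, letting you use the functional form instead. For example: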
- >>> from __future__ import print_function
- >>> print('# of entries', len(dictionary), file=sys.stderr)
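The signature of the new function is:
- def print(*args, sep=' ', end='\n', file=None)
The parameters are:
- args: positional arguments whose values will be printed out.
- sep: the separator, which will be printed between arguments.
- end: the ending text, which will be printed after all of the arguments have been output.
- file: the file object to which the output will be sent.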
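The keyword parameters can be exercised by printing into an in-memory buffer. This sketch is written for an interpreter where print is already a function; under 2.6 the module would additionally need from __future__ import print_function as its first statement. The buf variable is just an illustrative name.

```python
import io
import sys

buf = io.StringIO()

# sep replaces the default single space, end replaces the default newline.
print('a', 'b', 'c', sep='-', end='!\n', file=buf)

# Defaults: arguments separated by a space, followed by a newline.
print('# of entries', 3, file=buf)

output = buf.getvalue()

# Any object with a write() method works, including sys.stderr.
print('diagnostics go to stderr', file=sys.stderr)
```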
See also
- PEP 3105 [https://peps.python.org/pep-3105/] - print as a function
- PEP written by Georg Brandl.
PEP 3110: Exception-Handling Changes
One error that Python programmers occasionally make is writing the following code:
- try:
-     ...
- except TypeError, ValueError:  # Wrong!
-     ...
The author is probably trying to catch both TypeError
and ValueError
exceptions, but this code actually does something different: it will catch TypeError
and bind the resulting exception object to the local name "ValueError"
. The ValueError
exception will not be caught at all. The correct code specifies a tuple of exceptions:
- try:
-     ...
- except (TypeError, ValueError):
-     ...
This error happens because the use of the comma here is ambiguous: does it indicate two different nodes in the parse tree, or a single node that's a tuple?
Python 3.0 makes this unambiguous by replacing the comma with the word "as". To catch an exception and store the exception object in the variable exc
, you must write:
- try:
-     ...
- except TypeError as exc:
-     ...
Python 3.0 will only support the use of "as", and therefore interprets the first example as catching two different exceptions. Python 2.6 supports both the comma and "as", so existing code will continue to work. We therefore suggest using "as" when writing new Python code that will only be executed with 2.6.
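The tuple form and "as" can be combined in one clause, which works in 2.6 and in 3.0. The classify() helper below is a hypothetical function written only to illustrate this.

```python
def classify(value):
    """Return the parsed integer, or the name of the exception int() raised."""
    try:
        return int(value)
    except (TypeError, ValueError) as exc:
        # The tuple lists the types to catch; exc is the exception instance.
        return type(exc).__name__


parsed = classify('42')
bad_text = classify('forty-two')   # int() raises ValueError
bad_type = classify(None)          # int() raises TypeError
```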
See also
- PEP 3110 [https://peps.python.org/pep-3110/] - Catching Exceptions in Python 3000
- PEP written and implemented by Collin Winter.
PEP 3112: Byte Literals
Python 3.0 adopts Unicode as the language's fundamental string type and denotes 8-bit literals differently, either as b'string'
or using a bytes
constructor. For future compatibility, Python 2.6 adds bytes
as a synonym for the str
type, and it also supports the b''
notation.
The 2.6 str
differs from 3.0's bytes
type in various ways; most notably, the constructor is completely different. In 3.0, bytes([65, 66, 67])
is 3 elements long, containing the bytes representing ABC
; in 2.6, bytes([65, 66, 67])
returns the 12-byte string representing the str()
of the list.
The primary use of bytes
in 2.6 will be to write tests of object type such as isinstance(x, bytes)
. This will help the 2to3 converter, which can't tell whether 2.x code intends strings to contain either characters or 8-bit bytes; you can now use either bytes
or str
to represent your intention exactly, and the resulting code will also be correct in Python 3.0.
There's also a __future__
import that causes all string literals to become Unicode strings. This means that \u
escape sequences can be used to include Unicode characters:
- from __future__ import unicode_literals
- s = ('\u751f\u3080\u304e\u3000\u751f\u3054'
-      '\u3081\u3000\u751f\u305f\u307e\u3054')
- print len(s)               # 12 Unicode characters
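At the C level, Python 3.0 will rename the existing 8-bit string type, called PyStringObject in Python 2.x, to PyBytesObject. Python 2.6 uses #define to support using the names PyBytesObject(), PyBytes_Check(), PyBytes_FromStringAndSize(), and all the other functions and macros used with strings.
Instances of the bytes type are immutable just as strings are. A new bytearray type stores a mutable sequence of bytes: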
- >>> bytearray([65, 66, 67])
- bytearray(b'ABC')
- >>> b = bytearray(u'\u21ef\u3244', 'utf-8')
- >>> b
- bytearray(b'\xe2\x87\xaf\xe3\x89\x84')
- >>> b[0] = '\xe3'
- >>> b
- bytearray(b'\xe3\x87\xaf\xe3\x89\x84')
- >>> unicode(str(b), 'utf-8')
- u'\u31ef \u3244'
Byte arrays support most of the methods of string types, such as startswith()/endswith() and find()/rfind(), and also some of the methods of lists, such as append(), pop(), and reverse().
- >>> b = bytearray('ABC')
- >>> b.append('d')
- >>> b.append(ord('e'))
- >>> b
- bytearray(b'ABCde')
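The mix of string-like and list-like methods can be seen in one short sequence. This sketch sticks to integer byte values and b'' literals, which should behave the same way on 2.6 and on 3.x; the variable names are illustrative.

```python
data = bytearray(b'ABC')
data.append(ord('d'))             # list-like: add one byte at the end
data[0] = ord('a')                # item assignment mutates in place
data.reverse()                    # list-like: now holds b'dCBa'
tail = data.pop()                 # removes the last byte, returned as an integer
starts = data.startswith(b'dC')   # string-like method
frozen = bytes(data)              # immutable copy of the current contents
```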
There is also a corresponding C API, with PyByteArray_FromObject(), PyByteArray_FromStringAndSize(), and various other functions.
See also
- PEP 3112 [https://peps.python.org/pep-3112/] - Bytes literals in Python 3000
- PEP written by Jason Orendorff; backported to 2.6 by Christian Heimes.
PEP 3116: New I/O Library
Python's builtin file objects support a number of methods, but file-like objects don't necessarily support all of them. Objects that imitate files usually support read()
and write()
, but they may not support readline()
, for example. Python 3.0 introduces a layered I/O library in the io
module that separates buffering and text-handling features from the fundamental read and write operations.
There are three levels of abstract base classes provided by the io
module:
- RawIOBase defines raw I/O operations: read(), readinto(), write(), seek(), tell(), truncate(), and close(). Most of the methods of this class will often map to a single system call. There are also readable(), writable(), and seekable() methods for determining what operations a given object will allow.
Python 3.0 has concrete implementations of this class for files and sockets, but Python 2.6 hasn't restructured its file and socket objects in this way.
- BufferedIOBase is an abstract base class that buffers data in memory to reduce the number of system calls used, making I/O processing more efficient. It supports all of the methods of RawIOBase, and adds a raw attribute holding the underlying raw object.
There are five concrete classes implementing this ABC. BufferedWriter and BufferedReader are for objects that support write-only or read-only usage that have a seek() method for random access. BufferedRandom objects support read and write access upon the same underlying stream, and BufferedRWPair is for objects such as TTYs that have both read and write operations acting upon unconnected streams of data. The BytesIO class supports reading, writing, and seeking over an in-memory buffer.
- TextIOBase: Provides functions for reading and writing strings (remember, strings will be Unicode in Python 3.0), and supporting universal newlines. TextIOBase defines the readline() method and supports iteration upon objects.
There are two concrete implementations. TextIOWrapper wraps a buffered I/O object, supporting all of the methods for text I/O and adding a buffer attribute for access to the underlying object. StringIO simply buffers everything in memory without ever writing anything to disk.
(In Python 2.6, io.StringIO
is implemented in pure Python, so it's pretty slow. You should therefore stick with the existing StringIO
module or cStringIO
for now. At some point Python 3.0's io
module will be rewritten into C for speed, and perhaps the C implementation will be backported to the 2.x releases.)
In Python 2.6, the underlying implementations haven't been restructured to build on top of the io
module's classes. The module is being provided to make it easier to write code that's forward-compatible with 3.0, and to save developers the effort of writing their own implementations of buffering and text I/O.
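The layering can be seen by stacking a text wrapper on an in-memory byte buffer. This is a small sketch; the variable names and the sample text are illustrative.

```python
import io

raw = io.BytesIO()                                  # in-memory byte buffer
text = io.TextIOWrapper(raw, encoding='utf-8', newline='\n')

text.write(u'first line\nsecond: \u00e9\n')         # text layer takes Unicode
text.flush()                                        # push the encoded bytes down

encoded = raw.getvalue()                            # what the byte layer received

text.seek(0)
lines = [line for line in text]                     # text objects support iteration

same_buffer = text.buffer is raw                    # wrapper exposes the lower layer
```

The text layer handles encoding and line handling, while the byte layer underneath only ever sees encoded bytes.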
See also
- PEP 3116 [https://peps.python.org/pep-3116/] - New I/O
- PEP written by Daniel Stutzbach, Mike Verdone, and Guido van Rossum. Code by Guido van Rossum, Georg Brandl, Walter Doerwald, Jeremy Hylton, Martin von Löwis, Tony Lownds, and others.
PEP 3118: Revised Buffer Protocol
The buffer protocol is a C-level API that lets Python types exchange pointers into their internal representations. A memory-mapped file can be viewed as a buffer of characters, for example, and this lets another module such as re
treat memory-mapped files as a string of characters to be searched.
The primary users of the buffer protocol are numeric-processing packages such as NumPy, which expose the internal representation of arrays so that callers can write data directly into an array instead of going through a slower API. This PEP updates the buffer protocol in light of experience from NumPy development, adding a number of new features such as indicating the shape of an array or locking a memory region.
The most important new C API function is PyObject_GetBuffer(PyObject *obj, Py_buffer *view, int flags)
, which takes an object and a set of flags, and fills in the Py_buffer
structure with information about the object's memory representation. Objects can use this operation to lock memory in place while an external caller could be modifying the contents, so there's a corresponding PyBuffer_Release(Py_buffer *view)
to indicate that the external caller is done.
The flags argument to PyObject_GetBuffer()
specifies constraints upon the memory returned. Some examples are:
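- PyBUF_WRITABLE indicates that the memory must be writable.
- PyBUF_LOCK requests a read-only or exclusive lock on the memory.
- PyBUF_C_CONTIGUOUS and PyBUF_F_CONTIGUOUS request a C-contiguous (last dimension varies the fastest) or Fortran-contiguous (first dimension varies the fastest) array layout.
Two new argument codes for PyArg_ParseTuple(), s* and z*, return locked buffer objects for a parameter.
See also
- PEP 3118 [https://peps.python.org/pep-3118/] - Revising the buffer protocol
- PEP written by Travis Oliphant and Carl Banks; implemented by Travis Oliphant.
PEP 3119: Abstract Base Classes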
Some object-oriented languages such as Java support interfaces, declaring that a class has a given set of methods or supports a given access protocol. Abstract Base Classes (or ABCs) are an equivalent feature for Python. The ABC support consists of an abc
module containing a metaclass called ABCMeta
, special handling of this metaclass by the isinstance()
and issubclass()
builtins, and a collection of basic ABCs that the Python developers think will be widely useful. Future versions of Python will probably add more ABCs.
Let's say you have a particular class and wish to know whether it supports dictionary-style access. The phrase "dictionary-style" is vague, however. It probably means that accessing items with obj[1]
works. Does it imply that setting items with obj[2] = value
works? Or that the object will have keys()
, values()
, and items()
methods? What about the iterative variants such as iterkeys()
? copy()
and update()
? Iterating over the object with iter()
?
The Python 2.6 collections
module includes a number of different ABCs that represent these distinctions. Iterable
indicates that a class defines __iter__()
, and Container
means the class defines a __contains__()
method and therefore supports x in y
expressions. The basic dictionary interface of getting items, setting items, and keys()
, values()
, and items()
, is defined by the MutableMapping
ABC.
You can derive your own classes from a particular ABC to indicate they support that ABC's interface:
- import collections
- class Storage(collections.MutableMapping):
-     ...
Alternatively, you could write the class without deriving from the desired ABC and instead register the class by calling the ABC's register() method:
- import collections
- class Storage:
-     ...
- collections.MutableMapping.register(Storage)
For classes that you write, deriving from the ABC is probably clearer. The register() method is useful when you've written a new ABC that can describe an existing type or class, or if you want to declare that some third-party class implements an ABC. For example, if you defined a PrintableType ABC, it's legal to do:
- # Register Python's types
- PrintableType.register(int)
- PrintableType.register(float)
- PrintableType.register(str)
Classes should obey the semantics specified by an ABC, but Python can't check this; it's up to the class author to understand the ABC's requirements and to implement the code accordingly.
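To make the point about unchecked semantics concrete, here is a minimal sketch. PrintableType is the hypothetical ABC from the text, and Opaque is an invented stand-in for a third-party class; neither comes from the standard library. The ABC is created by calling ABCMeta directly, which is an assumption made only to avoid version-specific metaclass syntax.

```python
from abc import ABCMeta

# Hypothetical ABC from the text, built by calling the metaclass directly.
PrintableType = ABCMeta("PrintableType", (object,), {})


class Opaque(object):
    """Hypothetical third-party class with no printing support at all."""


# Registration is a declaration only; nothing about the class is inspected.
PrintableType.register(int)
PrintableType.register(Opaque)

print(issubclass(int, PrintableType))       # registered, so True
print(isinstance(Opaque(), PrintableType))  # registered, so True
print(issubclass(float, PrintableType))     # never registered, so False
```

Opaque passes the isinstance() check even though it defines no methods, which is why the class author remains responsible for honouring the ABC's contract.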
To check whether an object supports a particular interface, you can now write:
- import collections
- def func(d):
-     if not isinstance(d, collections.MutableMapping):
-         # Wrap d in a tuple so that tuple arguments are formatted correctly
-         raise ValueError("Mapping object expected, not %r" % (d,))
Don't feel that you must now begin writing lots of checks as in the above example. Python has a strong tradition of duck-typing, where explicit type-checking is never done and code simply calls methods on an object, trusting that those methods will be there and raising an exception if they aren't. Be judicious in checking for ABCs and only do it where it's absolutely necessary.
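For contrast, a duck-typed version of such a function might look like the sketch below. The function name store and its error message are illustrative assumptions; the idea is only that the code attempts the operation it needs and reacts to failure, rather than inspecting the argument's type first.

```python
def store(d, key, value):
    """Store a value, relying on item assignment instead of a type check."""
    try:
        d[key] = value
    except TypeError:
        # Raised by objects that do not support this item assignment.
        raise ValueError("Mapping object expected, not %r" % (d,))
    return d


print(store({}, "a", 1))
```

Any object that supports item assignment with the given key is accepted, whether or not it derives from or is registered with MutableMapping.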
You can write your own ABCs by using abc.ABCMeta as the metaclass in a class definition:
- from abc import ABCMeta, abstractmethod
- class Drawable():
-     __metaclass__ = ABCMeta
-     @abstractmethod
-     def draw(self, x, y, scale=1.0):
-         pass
-     def draw_doubled(self, x, y):
-         self.draw(x, y, scale=2.0)
- class Square(Drawable):
-     def draw(self, x, y, scale):
-         ...
In the Drawable ABC above, the draw_doubled() method renders the object at twice its size and can be implemented in terms of other methods described in Drawable. Classes implementing this ABC therefore don't need to provide their own implementation of draw_doubled(), though they can do so. An implementation of draw() is necessary, though; the ABC can't provide a useful generic implementation.
You can apply the @abstractmethod decorator to methods such as draw() that must be implemented; Python will then raise an exception for classes that don't define the method. Note that the exception is only raised when you actually try to create an instance of a subclass lacking the method:
- >>> class Circle(Drawable):
- ...     pass
- ...
- >>> c = Circle()
- Traceback (most recent call last):
- File "<stdin>", line 1, in <module>
- TypeError: Can't instantiate abstract class Circle with abstract methods draw
- >>>
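The pieces above can be combined into one self-contained sketch. The helper base AbstractBase and the calls list on Square are assumptions introduced for illustration: AbstractBase is created by calling ABCMeta directly so that subclasses inherit the metaclass without the version-specific __metaclass__ attribute, and Square records its draw() calls instead of rendering anything.

```python
from abc import ABCMeta, abstractmethod

# Assumption for portability: an empty helper base created by calling
# ABCMeta directly, so subclasses inherit the metaclass.
AbstractBase = ABCMeta("AbstractBase", (object,), {})


class Drawable(AbstractBase):
    @abstractmethod
    def draw(self, x, y, scale=1.0):
        pass

    def draw_doubled(self, x, y):
        # Concrete method implemented in terms of the abstract one.
        self.draw(x, y, scale=2.0)


class Square(Drawable):
    def __init__(self):
        self.calls = []

    def draw(self, x, y, scale=1.0):
        # Record the call instead of rendering anything.
        self.calls.append((x, y, scale))


class Circle(Drawable):
    pass  # draw() is deliberately missing


square = Square()
square.draw_doubled(3, 4)
print(square.calls)

try:
    Circle()
except TypeError as exc:
    # Defining Circle succeeded; only instantiating it fails.
    print("TypeError: %s" % exc)
```

Square inherits draw_doubled() unchanged, while Circle can be defined without error and fails only at the point where an instance is requested.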
Abstract data attributes can be declared using the @abstractproperty decorator:
- from abc import abstractproperty
- ...
- @abstractproperty
- def readonly(self):
-     return self._x
Subclasses must then define a readonly() property.
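A fuller sketch of an abstract property follows. The class names Shape, Point, and Incomplete, and the helper base AbstractBase, are hypothetical and exist only for this example. Note also that later Python versions deprecate abstractproperty in favour of combining property with abstractmethod, although the decorator is still importable there.

```python
from abc import ABCMeta, abstractproperty

# Assumption for portability: helper base built by calling ABCMeta directly.
AbstractBase = ABCMeta("AbstractBase", (object,), {})


class Shape(AbstractBase):
    @abstractproperty
    def readonly(self):
        """Subclasses must supply this property."""


class Point(Shape):
    def __init__(self, x):
        self._x = x

    @property
    def readonly(self):
        # Concrete property that satisfies the abstract declaration.
        return self._x


class Incomplete(Shape):
    pass  # readonly is deliberately missing


print(Point(5).readonly)

try:
    Incomplete()
except TypeError as exc:
    print("TypeError: %s" % exc)
```

As with abstract methods, the missing property is reported only when Incomplete is instantiated.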
See also
- PEP 3119 [https://peps.python.org/pep-3119/] - Introducing Abstract Base Classes
- PEP written by Guido van Rossum and Talin. Implemented by Guido van Rossum. Backported to 2.6 by Benjamin Aranguren, with Alex Martelli.
PEP 3127: Integer Literal Support and Syntax
Python 3.0 changes the syntax for octal (base-8) integer literals, prefixing them with "0o" or "0O" instead of a leading zero, and adds support for binary (base-2) integer literals, signalled by a "0b" or "0B" prefix.
Python 2.6 doesn't drop support for a leading 0 signalling an octal number, but it does add support for "0o" and "0b":
- >>> 0o21, 2*8 + 1
- (17, 17)
- >>> 0b101111
- 47
The oct() builtin still returns numbers prefixed with a leading zero, and a new bin() builtin returns the binary representation for a number:
- >>> oct(42)
- '052'
- >>> import future_builtins
- >>> future_builtins.oct(42)
- '0o52'
- >>> bin(173)
- '0b10101101'
The int() and long() builtins will now accept the "0o" and "0b" prefixes when base-8 or base-2 are requested, or when the base argument is zero (signalling that the base used should be determined from the string):
- >>> int('0o52', 0)
- 42
- >>> int('1101', 2)
- 13
- >>> int('0b1101', 2)
- 13
- >>> int('0b1101', 0)
- 13
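The behaviour shown above can be gathered into a short script. The helper name parse_int is a hypothetical convenience introduced here; it simply passes a base of zero so that int() infers the base from the prefix.

```python
# The new prefixes work as literals in source code...
assert 0o21 == 2 * 8 + 1
assert 0b101111 == 47


def parse_int(text):
    """Hypothetical helper: let int() infer the base from the prefix."""
    return int(text, 0)


# ...and as strings handed to int() with a base of zero.
for text in ("0b1101", "0o52", "0x2A", "42"):
    print("%s -> %d" % (text, parse_int(text)))

# The string produced by bin() can be fed straight back to int().
print(parse_int(bin(173)))
```

Because bin() emits the "0b" prefix, its output round-trips through int() with a base of zero without any string manipulation.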
See also
- PEP 3127 [https://peps.python.org/pep-3127/] - Integer Literal Support and Syntax
- PEP written by Patrick Maupin; backported to 2.6 by Eric Smith.