Python Code Compatibility
Check this manual often for updated information about the Splunk platform Python 3 migration. The content is subject to change.
To revise apps for compatibility with Splunk Enterprise version 8.x and higher, Python code should generally be made compatible with Python 3.7 or higher.
Python code can also be made compatible with both Python 2 and Python 3. To accomplish this, developers should include Python compatibility libraries as needed, in this order of recommended use:
- Six
- Python-future
- 2to3
Importing compatibility libraries
These libraries are provided with Splunk Enterprise to help you transition from Python 2 to Python 3. To use them in your app, you should simply import them as needed. For example:
import six
For cross-compatibility with the Splunk platform, do not include and distribute another version of these compatibility libraries with your app.
You should properly store and import cross-compatible Python libraries and update the Python path according to guidelines provided in The directory structure of a Splunk App in Splunk developer docs.
See the documentation for these libraries to learn how to use them to make your Python scripts compatible with both Python 2 and Python 3. However, these libraries might not produce fully compatible results, so the following tips cover updating scripts by hand to complete the transition to Python 2/3 compatible code.
- Fundamental tips
- Strings
- Integers
- Dictionaries and collections
- Sorting
- Modules
- Other tips
Fundamentals
Current Python version
Many patterns work in both Python 2 and Python 3. However, to write code that differs between Python 2 and Python 3, determine which version of the interpreter is being used with this code:
import sys
if sys.version_info >= (3, 0):
    print("version 3.x")
else:
    print("version 2.x")
Also, check the stanza-level Python-version settings of your scripts, combined with the admin-level Python-version setting in server.conf. For more information about stanza- and global-level Python version settings, see Changes to Splunk Enterprise.
Indentation
Python 3 is stricter about indentation than Python 2 and rejects code that mixes tabs and spaces inconsistently. For best cross-compatibility, follow PEP 8 and use space indentation only. For more information, see the PEP 8 Style Guide.
Print output
In Python 3, print() is a function instead of a statement. In code that must also run on Python 2, do not use print statements. Instead, enable the print function by including the following as one of the first imports:
from __future__ import print_function
In Python 2, you could suppress the trailing newline by putting a trailing comma on the argument list. For Python 2/3 compatible code, use sys.stdout.write() instead. In Python 2, you could also use print >>handle, "foo" to print to a file handle, but for Python 2/3 compatible code, use handle.write("foo\n") instead.
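As a minimal sketch of the write()-based pattern described above, the following code writes to an in-memory handle. The helper name emit and the sample text are illustrative assumptions, not part of any Splunk API.

```python
import io
import sys


def emit(handle, text, newline=True):
    """Write text to handle, optionally without a trailing newline."""
    handle.write(text + ("\n" if newline else ""))


# io.StringIO requires unicode text on Python 2, hence the u"" prefixes.
buf = io.StringIO()
emit(buf, u"foo")                  # replaces: print >>handle, "foo"
emit(buf, u"bar", newline=False)   # replaces: print "bar",
captured = buf.getvalue()

# Writing to standard output without an implicit newline.
sys.stdout.write("no trailing newline")
sys.stdout.write("\n")
```

Because write() never appends a newline on its own, the caller decides explicitly where lines end, which behaves the same way on both interpreters.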
Strings
Handling strings properly is important when writing cross-compatible Python code. For the purpose of this discussion, in Python there are three types of string:
- native strings (default) – for example, "abc"
- binary strings – for example, b"abc"
- unicode strings – for example, u"abc"
In Python 2, a native, default string is a binary or "bytes" string. In Python 3, a native string is instead a unicode string. Python 3 is strict about not mixing different types of strings. Often, an explicit conversion is needed or a runtime error is produced.
In Python 2, default and binary strings shared the same type but in Python 3, you now must keep the two string types distinctly separated. Binary data should not be stored in a default string since Python 3 defaults to unicode strings, not binary.
Using strings effectively in Python 3 is a matter of knowing which type of data is contained in a string, and when strings must be converted between types.
Strategies for strings
In small, self-contained scripts, native strings might be avoided, but that's not likely in large, extensible projects. You might want to just use Python 2's unicode() everywhere, or, from python-future, use from builtins import str, which in Python 2.7 can cause problems. Changing the normal behavior of native strings might work inside a single script, but not likely when interfacing with larger systems.
For both Python 2 and 3, the native default string is by far the most common. The most effective strategy is usually to explicitly reference native strings and binary/bytes strings. Then, only when running on Python 3, use explicit conversions where needed.
For more tips about Python strings, including use of io.StringIO, see Other Tips later in this topic.
Different string types
Here are examples of default strings:
- "normal strings" including with r"raw escaping"
- I/O done with files like open("filename.ext", "r")
- I/O streams from StringIO
Here are examples of binary/bytes strings:
- b"bytes strings" including with br"raw escaping". Note, Python 3 also accepts rb"xxx" but only br"xxx" is portable to Python 2.
- I/O done with files like open("filename.ext", "rb")
- I/O streams from io.BytesIO
- data passed via stdin/stdout/stderr using subprocess.PIPE when running external processes
- data passed to APIs like hashlib.sha1(bytes)
- data accepted and returned by base64.b64encode(bytes)
- data produced by pickle, pack(), and similar functions
Most string operations are available on either type. However, the result of an operation will be the same type as the starting string:
>>> "a,b,c".split(',')
['a', 'b', 'c']
>>> b"a,b,c".split(b',')
[b'a', b'b', b'c']
This applies to regular expressions; if you compile a regex pattern for a bytes string, the result is a pattern which can be used on bytes input.
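The following sketch illustrates the regular expression case under Python 3. The pattern and the sample input are made-up values chosen only for illustration.

```python
import re

# A pattern compiled from a bytes string matches bytes input only.
bytes_pattern = re.compile(br"(\w+)=(\d+)")
bytes_groups = bytes_pattern.search(b"port=8089").groups()

# A pattern compiled from a native string matches native strings.
text_pattern = re.compile(r"(\w+)=(\d+)")
text_groups = text_pattern.search("port=8089").groups()
```

In Python 3, the captured groups have the same type as the pattern, so bytes_groups holds bytes values and text_groups holds native strings.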
You cannot directly mix the types of strings:
>>> "a" + b"b"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "bytes") to str
Converting between string types
When a string must transition between a default string and a bytes string, use decode() and encode(). decode() takes a bytes string and returns a native string by decoding the bytes, which determines which unicode values the stream of bytes represents.
>>> b"\x61\xc3\xa4\x61".decode()
'aäa'
encode() takes a default string and returns a bytes string. It encodes individual unicode characters into the bytes needed to represent them. Both of these methods can take an encoding parameter, but default to UTF-8, which is usually appropriate.
When writing cross-compatible code, typically perform these conversions when specifically in Python 3. In Python 2, conversion methods exist, but convert between default strings and unicode strings. Typically in Python 2, default strings are used to hold both types of data, so it's best to keep them unconverted. An example of typical cross-compatible code for conversions is:
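import subprocess
import sys

process = subprocess.Popen(['cal'], stdout=subprocess.PIPE)
out, _ = process.communicate()
if sys.version_info >= (3, 0):
    # communicate() returns bytes in Python 3; decode once at the boundary
    out = out.decode()
print("result: " + out)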
In short, always determine which data should be in strings or bytes, and only convert at the minimal boundaries between the two types.
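One way to keep conversions at the boundaries is to wrap them in small helpers. The sketch below assumes two hypothetical helper names, to_native and to_bytes, which are not provided by Splunk or by the compatibility libraries.

```python
import sys

PY3 = sys.version_info >= (3, 0)


def to_native(value, encoding="utf-8"):
    """Return value as a native string; decode bytes only on Python 3."""
    if PY3 and isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Return value as a bytes string; encode native strings only on Python 3."""
    if PY3 and isinstance(value, str):
        return value.encode(encoding)
    return value


raw = b"a\xc3\xa4a"          # bytes as they might arrive from a pipe
text = to_native(raw)        # convert once, where the data enters
round_trip = to_bytes(text)  # convert once, where the data leaves
```

On Python 2 both helpers return their input unchanged, which matches the advice above to leave default strings unconverted there.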
Errors
When converting existing Python 2 code to cross-compatibility with Python 3, it's common to find a string operation fail on a type error. A common mistake when converting is to add encode() or decode() calls until it "works." This can quickly become a source of hidden bugs. When you have an error caused by incorrect mixing of string and binary data, ask two questions:
- Which type should the result be?
- Where does the value that isn't of the right type come from – and why is it in the other type?
Often, a single conversion of data to the correct type can avoid many later conversions. Change the mode of opening a binary file from "r" to "rb" (which is also accepted by Python 2). Convert a string early to the correct type rather than at the point of TypeError exceptions.
In Python 3, a native string can't hold arbitrary binary data; only a bytes string can do so. If you don't have binary data in the right type of string to start, it will cause errors even if you use encode() every time you use the string.
Finally, in Python 2, "abc" and b"abc" are equivalent. If the error you are getting in Python 3 involves a string constant operand being the wrong type, the solution is often to simply mark the string constant as a b"bytes string".
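As an illustration of marking constants as bytes, the sketch below passes bytes constants to two standard library APIs that require bytes in Python 3. The input value is arbitrary.

```python
import base64
import hashlib

# Both APIs take bytes in Python 3, so the constant is marked as bytes.
# In Python 2, b"abc" and "abc" are the same type, so this is portable.
digest = hashlib.sha1(b"abc").hexdigest()
encoded = base64.b64encode(b"abc")

# The results have different types in Python 3: hexdigest() returns a
# native string, while b64encode() returns a bytes string.
```

Knowing the type each call returns answers the first question above, which type the result should be, before any conversion is added.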
Additional tips
- Don't test for types against isinstance(x, basestring). If you must check that something is a "normal" string, use isinstance(x, splunk.util.string_type) instead.
- If Python 2 code is specifically expecting a u"unicode" string, use splunk.util.unicode() as a replacement.
s = unicode("abc") # BEFORE
s = splunk.util.unicode("abc") # AFTER
- json.loads() can accept either type of string. However, XML APIs like safe_lxml.fromstring() should use bytes string input in Python 3. They can accept simple XML in either format, but if the XML document contains an <?xml ... encoding= header, then the XML layer will try to perform unicode decoding and will fail if the input is already unicode.
Integers
Integer division
In Python 2, normal integer division returned the integer part of the result. For example, 7/2 == 3 was true. Expressions like x[0:len(x)/2] were valid. In Python 3, this is no longer true; the / operator on integers returns a float that includes the fractional part, so 7/2 == 3.5.
One way to return just the integer result is to use the // operator, which is explicitly an integer division. This operator is also available in Python 2.7. Another option is to convert the result into an int, such as int(x/2). Note that // rounds toward negative infinity while int() truncates toward zero, so the two differ for negative results.
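The sketch below shows both options side by side. The values noted in the comments assume Python 3, or Python 2 with from __future__ import division at the top of the file; the list contents are arbitrary.

```python
items = ["a", "b", "c", "d", "e", "f", "g"]

true_quotient = 7 / 2                  # float result in Python 3
floor_quotient = 7 // 2                # integer result, usable as an index
first_half = items[0:len(items) // 2]  # portable replacement for len(x)/2

# The two integer conversions differ for negative values.
floored = -7 // 2                      # rounds toward negative infinity
truncated = int(-7 / 2)                # truncates toward zero
```

For slice indexes and counts, // is usually the safer choice because it never produces a float.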
floor() and ceil()
In Python 2, math.floor() and math.ceil() both returned a float type result. In Python 3, they return an int type result.
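If code depends on the result type, one option is to wrap the call in int(), which gives an int on both versions. This is a minimal sketch with an arbitrary input value.

```python
import math

value = 7.5

# int() normalizes the float returned by Python 2 and is a no-op on the
# int returned by Python 3.
floor_index = int(math.floor(value))
ceil_index = int(math.ceil(value))
```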
No long type
In Python 2, there were two integer types: int and long, which were used depending on the size of the integer. In Python 3, only int exists and it handles either numeric range automatically. Numeric constants that are explicitly long, such as 123L, must be revised to simply 123.
Octal constants
In Python 3, an octal integer constant can no longer be expressed in the form 0123. The more explicit form 0o123 works in both Python 2.7 and Python 3.
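For illustration, the following sketch uses the portable octal form. The permission mask is just an example value.

```python
# 0o123 is octal for 1*64 + 2*8 + 3.
octal_value = 0o123

# A typical use of octal constants is a file permission mask.
mode = 0o644

# The text form differs by version: '0644' on Python 2, '0o644' on Python 3.
mode_text = oct(mode)
```

Only the literal syntax is portable. Code that parses the output of oct() needs to handle both text forms.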
Python exceptions
An important difference is that some older exception syntax that was accepted by Python 2.7 now produces an error in Python 3.
Replace any statements like:
raise MyException, "msg"
With the Python 3 form:
raise MyException("msg")
Also, replace:
except MyException, ex:
With:
except MyException as ex:
ex.message
In Python 3, exception objects no longer have a .message attribute. One way to get the message component of an exception is with str(ex).
However, for the exact object that ex.message provided in Python 2, use ex.args[0].
This works in both Python 2.7 and Python 3.x.
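Putting the exception changes together, a minimal sketch might look like the following. MyException and risky are hypothetical names used only for this example.

```python
class MyException(Exception):
    """Hypothetical application error used only for this sketch."""


def risky():
    # Call form of raise; the comma form is a syntax error in Python 3.
    raise MyException("msg")


try:
    risky()
except MyException as ex:      # "as" replaces the comma form
    message_text = str(ex)     # string form of the message
    first_arg = ex.args[0]     # the object originally passed in
```

The values are copied out inside the except block because Python 3 removes the name ex when the block ends.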
StopIteration
Generator functions should not explicitly raise StopIteration. Rather, they should return when they are finished. In Python 3.7 and higher, a StopIteration raised inside a generator is converted into a RuntimeError.
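The following sketch shows a generator that ends with return. The function name and input data are made up for illustration.

```python
def take_until_blank(lines):
    """Yield lines until the first blank one, then finish with return."""
    for line in lines:
        if not line.strip():
            return  # ends the generator; do not raise StopIteration here
        yield line


header = list(take_until_blank(["a: 1", "b: 2", "", "body"]))
```

A bare return behaves the same way in Python 2.7 and Python 3, so no version check is needed.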
Dictionaries and other collections
The following changes apply to dict and other collection types:
- dict no longer has the has_key() method. Replace:
if d.has_key("x"):
With:
if "x" in d: # works in 2.x and 3.x
- In Python 3, dict.keys() and dict.values() no longer directly return lists. Instead, they return objects of type dict_keys and dict_values, respectively. These objects can still be iterated normally, but sometimes code that depends on a list now requires explicit conversion. Replace:
l = d.keys() + [ "extra" ] # BEFORE
With:
l = list(d.keys()) + [ "extra" ] # AFTER
- Don't use .iteritems(), .iterkeys(), or .itervalues(). Rather, iterate on .items(), .keys(), or .values() directly.
- Instead of calling iter.next() on an iterator, use the next(iter) function. The associated method to override this function for your own classes is __next__(self).
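The collection changes above can be sketched together as follows. The dictionary contents and the Countdown class are hypothetical and exist only to make the example runnable.

```python
d = {"host": "localhost", "port": 8089}

has_host = "host" in d                          # replaces d.has_key("host")
keys_plus = sorted(list(d.keys()) + ["extra"])  # list() before concatenating
pairs = sorted(d.items())                       # replaces d.iteritems()


class Countdown(object):
    """Hypothetical iterator that supports both spellings of next."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):                         # used by Python 3
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    next = __next__                             # alias for Python 2


counter = Countdown(2)
first = next(counter)                           # replaces counter.next()
remaining = list(counter)
```

Defining __next__ and aliasing it to next is one common way to let a single class serve both interpreters.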
Sorting
The in-place lst.sort() method still exists on lists in Python 3, but it isn't available on the objects that dict.keys() and similar calls now return. Where possible, use the portable lst = sorted(lst) instead, at least where support for older versions of Python 2 isn't required. sorted() works fine on Python 2.7.
sorted() (and sort()) no longer take a cmp parameter. Instead, if you are just using cmp= to sort on an attribute of each element, it's usually simple to convert to the portable key parameter. Replace:
x.sort(cmp = lambda a, b: cmp(a.meth(), b.meth())) # BEFORE
With:
x = sorted(x, key = lambda e: e.meth()) # AFTER
This form works in Python version 2.4 and later.
If the cmp parameter is more complicated, use the cmp_to_key() function from the functools package. Replace:
x = sorted(x, cmp = my_fancy_compare_function) # BEFORE
With:
x = sorted(x, key = functools.cmp_to_key(my_fancy_compare_function)) # AFTER
Convert any "hidden" cmp parameters. They are not always passed in as named parameters, as in x.sort(my_cmp_function).
The top-level cmp(a, b) function isn't available in Python 3. Either rewrite your code not to require it, or consider using the replacement function from the splunk.util package (if your code only must function in Splunk 8.x).
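Where neither option fits, a small local stand-in for cmp() is a common workaround. The sketch below assumes two hypothetical functions, compare and by_length_then_alpha; it is not the splunk.util replacement.

```python
import functools


def compare(a, b):
    """Local stand-in for Python 2's cmp(): returns -1, 0, or 1."""
    return (a > b) - (a < b)


def by_length_then_alpha(a, b):
    """Hypothetical comparison function of the kind passed as cmp=."""
    return compare(len(a), len(b)) or compare(a, b)


words = ["pear", "fig", "apple", "kiwi"]
ordered = sorted(words, key=functools.cmp_to_key(by_length_then_alpha))
by_length = sorted(words, key=len)  # simple cases need only key=
```

The key-only form is preferable when it is enough, because it avoids calling a comparison function for every pair of elements.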
Classes that define a custom __cmp__() method in Python 2 should instead define both __eq__() and __lt__() methods. They can use the @total_ordering decorator from functools:
from functools import total_ordering

@total_ordering
class MyObject(object):
    # [...]
    def __eq__(self, other):
        return self._i == other._i
    def __lt__(self, other):
        return self._i < other._i
Module-specific advice
Many common Python modules have been reorganized in Python 3. Often, common functionality can be accessed in ways that are portable to both Python 2 and Python 3. Also, don't use the syntax from package import *. Instead, specify the subpackages needed.
StringIO and cStringIO
In Python 3, this module has moved to io.StringIO. Python 2 also has an io.StringIO library, which forces u"unicode" strings. For native/"default" strings under both Python 2 and Python 3, this often causes problems. One fix:
import sys
if sys.version_info >= (3, 0):
    from io import StringIO
else:
    from StringIO import StringIO
As covered in Strings, in Python 3, normal strings and bytes/binary aren't the same. For binary data, use the io.BytesIO object instead:
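import io
import sys

msg = b"Bytes I want to treat like a file"
if sys.version_info >= (3, 0):
    fh = io.BytesIO(msg)
else:
    # Python 2: the classic StringIO class holds byte strings
    from StringIO import StringIO
    fh = StringIO(msg)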
cStringIO is gone in Python 3. Use StringIO as above. This is a bit slower in Python 2, but faster in Python 3.
httplib
httplib is http.client in Python 3:
try:
    import http.client as httplib
except ImportError:
    import httplib
urlparse
urlparse is now urllib.parse in Python 3. From Six, you can use six.moves.urllib:
from six.moves.urllib import parse as urllib_parse
Or, from python-future, you can use future.moves:
from future.moves.urllib import parse as urllib_parse
Then call urllib_parse.urlparse(), .parse_qs(), .urlsplit(), .urlunsplit(), and so on.
If you need just a small number of methods, you can probe directly:
try:
    from urllib.parse import urlparse
except ImportError:
    from urlparse import urlparse
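Once imported this way, the functions behave the same on both versions. The sketch below uses the direct-probe import with a made-up URL; the host, port, and query parameters are illustrative values only.

```python
try:
    from urllib.parse import urlparse, parse_qs   # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs       # Python 2

# The URL below is a made-up example value.
parts = urlparse("https://localhost:8089/services/search/jobs?output_mode=json&count=5")
query = parse_qs(parts.query)
```

parse_qs() returns a dictionary that maps each parameter name to a list of values, so a single value is read as query["count"][0].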
urllib2
urllib2 is gone in Python 3, with most of its functionality moved to various submodules of urllib. Usually, convert using the six or python-future equivalents. For example, from future.moves:
import sys
from future.moves.urllib.request import urlopen, Request
from future.moves.urllib.error import HTTPError

# url is assumed to be defined earlier in your code
try:
    req = Request(url)
    res = urlopen(req)
except HTTPError as e:
    sys.stderr.write("nope!\n")
You can also import full packages from future.moves:
from future.moves.urllib import error as urllib_error
from future.moves.urllib import request as urllib_request
Cookie
Cookie is now http.cookies in Python 3.
try:
    from http.cookies import SimpleCookie
except ImportError:
    from Cookie import SimpleCookie
cPickle
cPickle is removed from Python 3. Import pickle instead, which is available in both Python 2 and Python 3.
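If the speed of cPickle still matters on Python 2, a probing import in the style used elsewhere in this topic is one option. This is a sketch, and the pickled dictionary is an arbitrary example value.

```python
try:
    import cPickle as pickle   # Python 2: faster C implementation
except ImportError:
    import pickle              # Python 3: uses a C accelerator when available

# Protocol 2 can be read by both Python 2 and Python 3.
payload = pickle.dumps({"count": 3}, protocol=2)
restored = pickle.loads(payload)
```

As noted in Strings, the pickled payload is a bytes string, so files that store it should be opened in "rb" or "wb" mode.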
string
The string functions in the string module, such as string.strip() and string.split(), are removed from Python 3. Mostly, string functions have become methods on str:
x = string.strip(y) # BEFORE
x = y.strip() # AFTER
lst = string.split(y, ',') # BEFORE
lst = y.split(',') # AFTER
This style works in both Python 2 and Python 3.
queue
Queue is renamed to queue in Python 3. Most Python 2 functionality is retained with:
try:
    import queue
except ImportError:
    import Queue as queue
thread
thread has been removed in Python 3. Some internals are available as _thread:
try:
    import _thread as thread
except ImportError:
    import thread
UserDict
In Python 3, UserDict is now part of collections:
try:
    from collections import UserDict
except ImportError:
    from UserDict import UserDict
contextlib.nested
contextlib.nested() is removed from Python 3. Instead, list multiple context managers, separated by commas, in a single with statement:
with contextlib.nested(open("a"), open("b")) as (fh_a, fh_b): # BEFORE
with open("a") as fh_a, open("b") as fh_b: # AFTER
ConfigParser
In Python 3, ConfigParser has been renamed configparser. However, this is not a direct library change, but an overhaul of ConfigParser. One of the biggest differences is the default key/value delimiters, which are = and :. ConfigParser defaulted to =. Many Splunk .cfg and .conf files assume : is not a delimiter.
Duplicate handling is another default of configparser that differs from ConfigParser. ConfigParser allowed duplicates by default, but configparser throws exceptions on duplicates by default. There are many cases of .cfg and .conf files with duplicates, and the parameter that controls this is strict.
For compatibility with Python 2 and 3 when using configparser, create instances with the following:
configparser.ConfigParser(delimiters=('=',), strict=False)
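To see what these two parameters change, the sketch below parses a made-up stanza under Python 3's configparser. The stanza name, keys, and values are illustrative assumptions and do not come from a real Splunk configuration file.

```python
import configparser  # Python 3 module name

# Made-up stanza with a colon inside a key and a duplicated key.
SAMPLE = u"""
[example]
host:port = localhost:8089
retries = 1
retries = 3
"""

parser = configparser.ConfigParser(delimiters=("=",), strict=False)
parser.read_string(SAMPLE)

endpoint = parser.get("example", "host:port")
retries = parser.get("example", "retries")   # the later duplicate wins

# With the defaults, the same text is rejected because of the duplicate key.
try:
    configparser.ConfigParser().read_string(SAMPLE)
    default_error = None
except configparser.Error as err:
    default_error = type(err).__name__
```

With strict=False the parser keeps the last value it sees for a repeated key, so the order of entries in the file matters.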
Other changes
The following additional changes also apply to Python code to be made compatible with both Python 2 and 3:
- In Python 3, xrange() is removed. If the range is not too large, use range(). In Python 2, range() is available but slower than xrange(). In Python 3, range() is faster.
- In Python 3, file() is no longer callable as a function. Use the open() function instead with the same parameters. open() is available in Python 2, also.
- In Python 3, os.path.walk() is removed. Convert code to use the portable os.walk() instead.
- In Python 3, classes should not directly assign __metaclass__:
# BEFORE:
class MyClass(object):
    __metaclass__ = SomeOtherObject

# AFTER:
from future.utils import with_metaclass
class MyClass(with_metaclass(SomeOtherObject, object)):
    pass
- In Python 3, raw_input() is removed. Use input() instead. If your code needs to work the same in Python 2 and Python 3, use:
if sys.version_info >= (3, 0):
    response = input("Prompt: ")
else:
    response = raw_input("Prompt: ")
- In Python 3, execfile() is removed. Use exec(open(file).read()) instead.
- In Python 3, the statement form exec "code" in global_map is removed. Python 3 and Python 2 both accept exec("code", global_map) instead.
- In Python 3, reduce() has been moved to the functools module:
import functools
functools.reduce(lambda x, y: x + y, [47, 11, 42, 13])
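Several of the changes above can be sketched in one short script. The variable names and the code string passed to exec() are arbitrary examples.

```python
import functools

# range() works on both versions; on Python 2 it builds a full list.
squares = [n * n for n in range(5)]

# The function form of exec is accepted by Python 2 and Python 3.
global_map = {}
exec("answer = 6 * 7", global_map)
answer = global_map["answer"]

# reduce() imported from functools works on both versions.
total = functools.reduce(lambda x, y: x + y, [47, 11, 42, 13])
```

Passing an explicit dictionary to exec() keeps the executed code from modifying the caller's own namespace.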
This documentation applies to the following versions of Splunk Cloud Platform™: 8.2.2112, 8.2.2201, 8.2.2202, 8.2.2203, 9.0.2205, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)