Python Code Compatibility
Check this manual often for updated information about the Splunk platform Python 3 migration. The content is subject to change.
To revise apps for compatibility with Splunk Enterprise version 8.0, Python code should generally be made compatible with both Python 2 and Python 3. To accomplish this, developers should include Python compatibility libraries as needed, in this order of recommended use:
See the documentation for these libraries to learn how they can help make your Python scripts compatible with both Python 2 and Python 3. However, these libraries might not produce fully compatible results, so here are a number of tips for updating scripts by hand to complete the transition to 2/3 compatible Python code.
- Fundamental tips
- Dictionaries and collections
- Other tips
Current Python version
Many patterns work in both Python 2 and Python 3. However, to write code that differs between Python 2 and Python 3, determine which version of the interpreter is being used with this code:
import sys

if sys.version_info >= (3, 0):
    print("version 3.x")
else:
    print("version 2.x")
Also, check the stanza-level Python-version settings of your scripts, combined with the admin-level Python-version setting in server.conf. For more information about stanza- and global-level Python version settings, see Changes to Splunk Enterprise.
Python 3 has different indentation rules, and mixing tabs and spaces can cause problems. For best cross-compatibility, follow PEP8 and use space indentation only. For more information, see the PEP 8 Style Guide.
In Python 3, print() is now a function instead of a statement. In Python 2, do not use print statements. You can enable the print function by including this as one of the first imports:
from __future__ import print_function
In Python 2, you could keep print from printing a new line by putting a trailing comma on the argument list. For Python 2/3 compatible code, use sys.stdout.write() instead. In Python 2, you could use print >>handle, "foo" to write to a file handle, but for Python 2/3 compatible code, instead use handle.write("foo\n").
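As a minimal sketch of the two replacements described above, the following writes output without a trailing newline and writes to a file handle, using only write() calls that behave the same way under Python 2 and Python 3. The emit() helper is a hypothetical name used only for illustration.

```python
import sys
import tempfile


def emit(handle, text, newline=True):
    """Write text to a file-like handle, optionally followed by a newline."""
    handle.write(text)
    if newline:
        handle.write("\n")


# Replaces the Python 2 trailing-comma form: print "loading...",
emit(sys.stdout, "loading...", newline=False)
emit(sys.stdout, " done")

# Replaces the Python 2 redirect form: print >>handle, "foo"
with tempfile.TemporaryFile(mode="w+") as handle:
    emit(handle, "foo")
    handle.seek(0)
    captured = handle.read()
```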
Handling strings properly is important when writing cross-compatible Python code. For the purpose of this discussion, in Python there are three types of string:
- native strings (default) – for example, "abc"
- binary strings – for example, b"abc"
- unicode strings – for example, u"abc"
In Python 2, a native, default string is a binary or "bytes" string. In Python 3, a native string is instead a unicode string. Python 3 is strict about not mixing different types of strings. Often, an explicit conversion is needed or a runtime error is produced.
In Python 2, default and binary strings shared the same type, but in Python 3 you must keep the two string types separate. Binary data should not be stored in a default string, since Python 3 defaults to unicode strings, not binary.
Using strings effectively in Python 3 is a matter of knowing which type of data is contained in a string, and when strings must be converted between types.
Strategies for strings
In small, self-contained scripts, native strings might be avoided, but that's not likely in large, extensible projects. You might want to just use Python 2's unicode() everywhere, or from python-future use from builtins import str which, in Python 2.7, can cause problems. Changing the normal behavior of native strings might work inside a single script, but not likely when interfacing with larger systems.
For both Python 2 and 3, the native default string is by far the most common. The most effective strategy is usually to explicitly reference native strings and binary/bytes strings. Then, only when running on Python 3, use explicit conversions where needed.
For more tips about Python strings, including use of io.StringIO, see Other Tips later in this topic.
Different string types
Here are examples of default strings:
- "normal strings" including with r"raw escaping"
- I/O done with files like open("filename.ext", "r")
- I/O streams from StringIO
Here are examples of binary/bytes strings:
- b"bytes strings" including with br"raw escaping". Note, Python 3 also accepts rb"xxx" but only br"xxx" is portable to Python 2.
- I/O done with files like
- I/O streams from
- data passed via stdin/stdout/stderr using subprocess.PIPE when running external processes
- data passed to APIs like
- data accepted and returned by
- data produced by pack(), and similar functions
Most string operations are available on either type. However, the result of an operation will be the same type as the starting string:
>>> "a,b,c".split(',')
['a', 'b', 'c']
>>> b"a,b,c".split(b',')
[b'a', b'b', b'c']
This applies to regular expressions; if you compile a regex pattern for a bytes string, the result is a pattern which can be used on bytes input.
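A short sketch of this behavior with the standard re module, using made-up sample data:

```python
import re

# A pattern compiled from a bytes string is used on bytes input
# and returns bytes results.
bytes_pattern = re.compile(br"\d+")
bytes_matches = bytes_pattern.findall(b"a1b22c333")

# A pattern compiled from a native string is used on native strings.
text_pattern = re.compile(r"\d+")
text_matches = text_pattern.findall("a1b22c333")
```

In Python 3, passing a native string to bytes_pattern (or bytes to text_pattern) raises a TypeError, so compile the pattern in the same type as the data it will search.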
You cannot directly mix the types of strings:
>>> "a" + b"b"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "bytes") to str
Converting between string types
When a string must transition between a default string and a bytes string, use encode() and decode().
decode() takes a bytes string and returns a native string by decoding the bytes, which determines which unicode values the stream of bytes represents.
>>> b"\x61\xc3\xa4\x61".decode()
'aäa'
encode() takes a default string and returns a bytes string. It encodes individual unicode characters into the bytes needed to represent them. Both of these methods can take an encoding parameter, but in Python 3 they default to UTF-8, which is usually appropriate.
When writing cross-compatible code, typically perform these conversions only when running under Python 3. In Python 2, the conversion methods exist, but they convert between default strings and unicode strings. Typically in Python 2, default strings are used to hold both types of data, so it's best to keep them unconverted. An example of typical cross-compatible code for conversions is:
import subprocess
import sys

process = subprocess.Popen(['cal'], stdout=subprocess.PIPE)
out, _ = process.communicate()
if sys.version_info >= (3, 0):
    # Only Python 3 returns bytes that must be decoded to a native string.
    out = out.decode()
print("result: " + out)
In short, always determine which data should be in strings or bytes, and only convert at the minimal boundaries between the two types.
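One way to keep conversions at the boundaries is to route them through a pair of small helpers. This is only a sketch: to_native() and to_bytes() are hypothetical names, not part of Splunk or the standard library, and UTF-8 is assumed as the encoding.

```python
import sys


def to_native(data, encoding="utf-8"):
    """Return data as a native string, decoding only when that is needed."""
    if sys.version_info >= (3, 0) and isinstance(data, bytes):
        return data.decode(encoding)
    # Python 2 native strings already hold bytes, so leave them unconverted.
    return data


def to_bytes(data, encoding="utf-8"):
    """Return data as a bytes string, encoding only when that is needed."""
    if sys.version_info >= (3, 0) and isinstance(data, str):
        return data.encode(encoding)
    return data


raw = b"result: ok"          # for example, bytes read from a subprocess pipe
message = to_native(raw)     # convert once, at the boundary
payload = to_bytes(message)  # convert back only where bytes are required
```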
When converting existing Python 2 code to cross-compatibility with Python 3, it's common to find a string operation that fails with a type error. A common mistake when converting is to add decode() calls until it "works." This can quickly become a source of hidden bugs. When you have an error caused by incorrect mixing of string and binary data, ask two questions:
- Which type should the result be?
- Where does the value that isn't of the right type come from – and why is it in the other type?
Often, a single conversion of data to the correct type can avoid many later conversions. Change the mode of opening a binary file from "r" to "rb" (which is also accepted by Python 2). Convert a string early to the correct type rather than at the point of TypeError exceptions.
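The following sketch illustrates the idea with a temporary file whose contents are made up for the example: the file is opened in "rb" mode, processed as bytes, and converted once at the point where text is needed.

```python
import os
import sys
import tempfile

# Create a small sample file so the example is self-contained.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fh:
    fh.write(b"host=a\xc3\xa4a\n")

# Open in binary mode: the data is bytes on both Python 2 and Python 3.
with open(path, "rb") as fh:
    raw = fh.read()

# Split while the data is still bytes, using bytes operands throughout.
key, value = raw.strip().split(b"=", 1)

# Convert once, where text is needed (only necessary on Python 3).
if sys.version_info >= (3, 0):
    key = key.decode()
    value = value.decode()

os.remove(path)
```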
In Python 3, a native string can't hold arbitrary binary data; only a bytes string can do so. If you don't have binary data in the right type of string to start, it will cause errors even if you use encode() every time you use the string.
Finally, in Python 2, "abc" and b"abc" are equivalent. If the error you are getting in Python 3 involves a string constant operand being the wrong type, the solution is often simply to mark the string constant as a b"bytes string".
- Don't test for types against
isinstance(x, basestring). If you must check that something is a "normal" string, use
- If Python 2 code is specifically expecting a u"unicode" string, use splunk.util.unicode() as a replacement.
s = unicode("abc") # BEFORE
s = splunk.util.unicode("abc") # AFTER
json.loads() can accept either type of string. However, XML APIs like safe_lxml.fromstring() should use bytes string input in Python 3. They can accept simple XML in either format, but if the XML document contains an <?xml ... encoding= header, then the XML layer will try to perform unicode decoding and will fail if input is already unicode.
In Python 2, normal integer division returned the integer part of the result. For example, 7/2 == 3 was true. Expressions like x[0:len(x)/2] were valid. In Python 3, this is no longer true; dividing two integers with / returns a float that includes the fractional part.
One way to return just the integer result is to use the // operator, which is explicitly an integer division. This operator is also available in Python 2.7. Another option is to convert the result into an int, such as int(7/2).
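A brief sketch of both options, using a made-up list. Note that the two are not interchangeable for negative values: // floors, while int() truncates toward zero.

```python
items = ["a", "b", "c", "d", "e", "f", "g"]

# Floor division gives an integer on both Python 2.7 and Python 3,
# so the result can be used as a slice index.
midpoint = len(items) // 2
first_half = items[0:midpoint]

# int() truncates toward zero, which differs from // for negative values.
truncated = int(-7 / 2.0)
floored = -7 // 2
```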
floor() and ceil()
In Python 2, math.floor() and math.ceil() both returned a float type result. In Python 3, they return an int type result.
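If code depends on one specific result type, an explicit conversion makes the behavior identical on both versions. A minimal sketch:

```python
import math

value = 7.5

# Wrapping the result in int() gives an int on both Python 2 and Python 3.
low = int(math.floor(value))
high = int(math.ceil(value))

# Code that needs a float can convert explicitly instead.
low_as_float = float(math.floor(value))
```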
No long type
In Python 2, there were two integer types: int and long, which were used depending on the size of the integer. In Python 3, only int exists and it handles either numeric range automatically. Numeric constants that are explicitly long, such as 123L, must be revised to simply 123.
In Python 3, octal integer constants can no longer be expressed in the form 0123. The more explicit form 0o123 works both in Python 2.7 and Python 3.
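For illustration, here are portable spellings of an octal constant and of a large integer; the variable names and values are arbitrary.

```python
# The 0o prefix is accepted by both Python 2.7 and Python 3.
file_mode = 0o644

# No L suffix is needed: Python 3 has a single int type, and Python 2
# promotes to long automatically when the value is large enough.
large_value = 2 ** 64
```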
An important difference is that some older exception syntax that was accepted by Python 2.7 now produces an error in Python 3.
Replace any statements like:
raise MyException, "msg"
With the Python 3 form:
raise MyException("msg")
Similarly, replace:
except MyException, ex:
With:
except MyException as ex:
In Python 3, exception objects no longer have a .message attribute. One way to get the message component of an exception is with str(ex). However, for the exact object that ex.message provided in Python 2, use ex.args[0]. This works in both Python 2.7 and Python 3.x.
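A short sketch that combines the portable raise and except forms with two common ways of reading the message, str(ex) and ex.args[0]. MyException and risky() are hypothetical names used only for this example.

```python
class MyException(Exception):
    """Hypothetical exception type used only for this example."""


def risky():
    # Call the exception class instead of using the old comma form.
    raise MyException("msg")


try:
    risky()
except MyException as ex:
    as_text = str(ex)       # the message rendered as a string
    first_arg = ex.args[0]  # the original object passed to the constructor
```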
Generator functions should not explicitly raise StopIteration. Rather, they should return when they are finished, instead of yielding. In Python 3.7 and later, a StopIteration raised explicitly inside a generator produces a RuntimeError.
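A minimal sketch of a generator that ends early with return; the function name and data are made up for the example.

```python
def take_until_negative(values):
    """Yield values until the first negative one, then finish."""
    for value in values:
        if value < 0:
            # Finish with return rather than raising StopIteration.
            return
        yield value


collected = list(take_until_negative([3, 1, 4, -1, 5]))
```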
Dictionaries and other collections
The following changes apply to dict and other collection types:
- dict no longer has the has_key() method. Use the in operator instead:
if "x" in d: # works in 2.x and 3.x
- In Python 3, dict.keys(), dict.items(), and dict.values() no longer directly return lists. Instead, they return objects of type dict_keys, dict_items, and dict_values, respectively. These objects can still be iterated normally, but sometimes code that depends on a list now requires explicit conversion. Replace:
l = d.keys() + [ "extra" ] # BEFORE
l = list(d.keys()) + [ "extra" ] # AFTER
- Don't use .iterkeys(), .iteritems(), or .itervalues(). Rather, iterate on .keys(), .items(), or .values().
- Instead of calling iter.next() on an iterator, use the next(iter) function. The associated method to override this function for your own classes is __next__(self).
The lst.sort() method is deprecated. Instead, use the portable lst = sorted(lst), at least where support for older versions of Python 2 isn't required. sorted() works fine on Python 2.7, however.
Sorting functions (sorted() and lst.sort()) no longer take a cmp parameter. Instead, if you are just using cmp= to sort on an attribute of each element, it's usually simple to convert to the portable key parameter. Replace:
x.sort(cmp = lambda a, b: cmp(a.meth(), b.meth())) # BEFORE
x = sorted(x, key = lambda e: e.meth()) # AFTER
This form works in Python version 2.4 and later.
If the cmp parameter is more complicated, use the cmp_to_key() function from the functools package. Replace:
x = sorted(x, cmp = my_fancy_compare_function) # BEFORE
x = sorted(x, key = functools.cmp_to_key(my_fancy_compare_function)) # AFTER
Convert any "hidden" cmp parameters as well; they are not always passed in as named parameters.
The cmp(a, b) function isn't available in Python 3. Either rewrite your code not to require it, or consider using the replacement function from the splunk.util package (if your code only must function in Splunk 8.0 and later).
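Where the splunk.util replacement isn't an option, a local stand-in for cmp() is a common idiom. The sketch below defines one and uses it with functools.cmp_to_key(); compare() and by_length_then_alpha() are hypothetical names, and this is not the splunk.util implementation.

```python
import functools


def compare(a, b):
    """Return -1, 0, or 1, like the Python 2 built-in cmp()."""
    return (a > b) - (a < b)


def by_length_then_alpha(a, b):
    """Comparison function: shorter strings first, then alphabetical."""
    return compare(len(a), len(b)) or compare(a, b)


names = ["pear", "fig", "apple", "kiwi"]
ordered = sorted(names, key=functools.cmp_to_key(by_length_then_alpha))
```

When the comparison only looks at one derived value, a plain key function is simpler than cmp_to_key().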
Classes that define a custom __cmp__() method in Python 2 should instead define both __eq__() and __lt__() methods. They can use the @total_ordering decorator from functools:
from functools import total_ordering

@total_ordering
class MyObject(object):
    # [...]
    def __eq__(self, other):
        return self._i == other._i

    def __lt__(self, other):
        return self._i < other._i
Many common Python modules have been reorganized in Python 3. Often, common functionality can be accessed in ways that are portable to both Python 2 and Python 3. Also, don't use the syntax from package import *. Instead, specify the subpackages needed.
StringIO and cStringIO
In Python 3, this module has moved to io.StringIO. Python 2 also has an io.StringIO library, which forces u"unicode" strings. For native/"default" strings under both Python 2 and Python 3, this often causes problems. One fix:
import sys

if sys.version_info >= (3, 0):
    from io import StringIO
else:
    from StringIO import StringIO
As covered in Strings, in Python 3, normal strings and bytes/binary aren't the same. For binary data, use the io.BytesIO object instead:
import io
import sys

msg = b"Bytes I want to treat like a file"
if sys.version_info >= (3, 0):
    fh = io.BytesIO(msg)
else:
    # Python 2: native strings already hold bytes.
    from StringIO import StringIO
    fh = StringIO(msg)
cStringIO is gone in Python 3. Use StringIO as above. This is a bit slower in Python 2, but faster in Python 3.
httplib is http.client in Python 3:
try:
    import http.client as httplib
except ImportError:
    import httplib
urlparse is now urllib.parse in Python 3. From Six, you can use
from six.moves.urllib import parse as urllib_parse
Or, from python-future, you can use
from future.moves.urllib import parse as urllib_parse
.urlunsplit(), and so on.
If you need just a small number of methods, you can probe directly:
try:
    from urllib.parse import urlparse
except ImportError:
    from urlparse import urlparse
urllib2 is gone in Python 3, with most of its functionality moved to submodules of urllib such as urllib.request and urllib.error. Usually, convert using the six or python-future equivalents. For example, from python-future:
import sys
from future.moves.urllib.request import urlopen, Request
from future.moves.urllib.error import HTTPError

# url is assumed to be defined earlier in your code.
try:
    req = Request(url)
    res = urlopen(req)
except HTTPError as e:
    sys.stderr.write("nope!\n")
You can also import full packages from future.moves:
from future.moves.urllib import error as urllib_error
from future.moves.urllib import request as urllib_request
Cookie is now http.cookies in Python 3.
try:
    from http.cookies import SimpleCookie
except ImportError:
    from Cookie import SimpleCookie
cPickle is removed from Python 3.
Most functions of the string module, such as string.strip() and string.split(), are removed from Python 3. Mostly, string functions have become methods on string objects:
x = string.strip(y) # BEFORE
x = y.strip() # AFTER
lst = string.split(y, ',') # BEFORE
lst = y.split(',') # AFTER
The method style works in both Python 2 and Python 3.
Queue is renamed to queue in Python 3. Most Python 2 functionality is retained with:
try:
    import queue
except ImportError:
    import Queue as queue
thread has been removed in Python 3. Some internals are available as _thread:
try:
    import _thread as thread
except ImportError:
    import thread
In Python 3, UserDict is now part of collections:
try:
    from collections import UserDict
except ImportError:
    from UserDict import UserDict
contextlib.nested() is removed from Python 3. Use a comma-separated list of context managers in the with statement:
with contextlib.nested(open("a"), open("b")) as (fh_a, fh_b): # BEFORE
with open("a") as fh_a, open("b") as fh_b: # AFTER
In Python 3, ConfigParser has been renamed configparser. However, this is not just a rename, but an overhaul of ConfigParser. One of the biggest differences is the default key/value delimiters, which are = and :. ConfigParser defaulted to =. Many Splunk .cfg and .conf files assume : is not a delimiter.
Duplicate handling is another default of configparser that differs from ConfigParser. ConfigParser allowed duplicates by default, but configparser throws exceptions on duplicates by default. There are many cases of .cfg and .conf files with duplicates, and the parameter that controls this is strict.
For compatibility with Python 2 and 3 when using configparser, create instances with the following:
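A minimal sketch of such an instance, assuming that = should be the only delimiter and that duplicates should be tolerated, as described above. The make_parser() helper and the sample file contents are hypothetical.

```python
import os
import sys
import tempfile

if sys.version_info >= (3, 0):
    import configparser
else:
    import ConfigParser as configparser


def make_parser():
    """Create a parser configured as described above (hypothetical helper)."""
    if sys.version_info >= (3, 0):
        # Assumption: treat only "=" as a delimiter and tolerate duplicates.
        return configparser.ConfigParser(delimiters=("=",), strict=False)
    # The Python 2 class does not accept these keyword arguments.
    return configparser.ConfigParser()


# Sample file with a duplicated key, written to a temporary location.
fd, path = tempfile.mkstemp(suffix=".conf")
with os.fdopen(fd, "w") as fh:
    fh.write("[general]\nname = first\nname = second\n")

parser = make_parser()
parser.read(path)
name = parser.get("general", "name")
os.remove(path)
```

With strict=False, a repeated key does not raise an exception; the last value read is the one that is kept.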
The following additional changes also apply to Python code to be made compatible with both Python 2 and 3:
- In Python 3, xrange() is removed. If the range is not too large, use range(). In Python 2, range() is available but slower than xrange(). In Python 3, range() is lazy in the same way that xrange() was in Python 2.
- In Python 3, file() is no longer callable as a function. Use the open() function instead with the same parameters. open() is available in Python 2, also.
- In Python 3, os.path.walk() is removed. Convert code to use the os.walk() function.
- In Python 3, classes should not directly assign __metaclass__. Replace:
# BEFORE:
class MyClass(object):
    __metaclass__ = SomeOtherObject

# AFTER:
from future.utils import with_metaclass
class MyClass(with_metaclass(SomeOtherObject, object)):
    pass
- In Python 3, raw_input() is removed. Use input() instead. If your code needs to work the same in Python 2 and Python 3, use:
import sys

if sys.version_info >= (3, 0):
    response = input("Prompt: ")
else:
    response = raw_input("Prompt: ")
- In Python 3,
execfile()is removed. Use
- In Python 3,
global_mapis removed. Python 3 and Python 2 both accept
- In Python 3, reduce() has been moved to the functools package:
import functools
functools.reduce(lambda x, y: x + y, [47, 11, 42, 13])
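For the os.path.walk() item in the list above, the following sketch shows a directory traversal written with os.walk(), which is available in both Python 2 and Python 3. The directory layout is created only for the example.

```python
import os
import shutil
import tempfile

# Build a small directory tree so the example is self-contained.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
for relative in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(root, relative), "w") as fh:
        fh.write("data\n")

# os.walk() yields a (dirpath, dirnames, filenames) tuple per directory,
# replacing the callback style of os.path.walk().
found = []
for dirpath, dirnames, filenames in os.walk(root):
    for filename in filenames:
        found.append(os.path.relpath(os.path.join(dirpath, filename), root))
found.sort()

shutil.rmtree(root)
```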
This documentation applies to the following versions of Splunk® Enterprise: 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.0.6, 7.0.7, 7.0.8, 7.0.9, 7.0.10, 7.0.11, 7.0.13, 7.1.0, 7.1.1, 7.1.2, 7.1.3, 7.1.4, 7.1.5, 7.1.6, 7.1.7, 7.1.8, 7.1.9, 7.1.10, 7.2.0, 7.2.1, 7.2.2, 7.2.3, 7.2.4, 7.2.5, 7.2.6, 7.2.7, 7.2.8, 7.2.9, 7.2.10, 7.3.0, 7.3.1, 7.3.2, 7.3.3, 7.3.4, 7.3.5, 8.0.0, 8.0.1, 8.0.2, 8.0.3