Plenty has already been written and documented about timezones in Python, so this post covers one narrow aspect of the topic. It was a surprise to me, and not immediately apparent from other documentation, that three-letter timezone symbols (e.g. PST / EST) are not a robust approach to timezones. For that reason I felt the world could benefit from another blog post on the subject.
A datetime string is parsed in Python as follows.

    from datetime import datetime
    dt = '1 Jan 2018 02:00:01'
    fmt = '%d %b %Y %H:%M:%S'
    datetime.strptime(dt, fmt)
    Out: datetime.datetime(2018, 1, 1, 2, 0, 1)

A datetime string with a timezone component in the format ±HHMM[SS] with respect to UTC is parsed as follows.

    dt = '1 Jan 2018 02:00:01 +0400'
    fmt = '%d %b %Y %H:%M:%S %z'
    datetime.strptime(dt, fmt)
    Out: datetime.datetime(2018, 1, 1, 2, 0, 1, tzinfo=datetime.timezone(datetime.timedelta(seconds=14400)))

Python raises an error, however, when the timezone component of the datetime string is a 3-letter timezone symbol.

    dt = '1 Jan 2018 02:00:01 EST'
    fmt = '%d %b %Y %H:%M:%S %Z'
    datetime.strptime(dt, fmt)
    ValueError: time data '1 Jan 2018 02:00:01 EST' does not match format '%d %b %Y %H:%M:%S %Z'

In this case Python is not able to parse the timezone part of the string, EST, even though the current documentation suggests it should be able to do so. (The %Z directive of strptime only accepts UTC, GMT, and the names in time.tzname for the local machine.) The fundamental reason is that a timezone named EST could at some time (now, in the past, or in the future) be an ambiguous timezone name. It does not have to be that way; it just is, because of the inconsistent use of timezone names and daylight saving time by everyone involved. In fact, the Python datetime module clearly documents that it does not distinguish between ambiguous datetimes and will never do so. The only timezone names Python can currently understand are those in the Olson database, which is accessed by the pytz module.
    In : pytz?
    Type: module
    Docstring:
    datetime.tzinfo timezone definitions generated from the
    Olson timezone database:
        ftp://elsie.nci.nih.gov/pub/tz*.tar.gz

The official list of timezone names recognized by Python is in the pytz.all_timezones list.

    In : pytz.all_timezones
    Out: ['Africa/Abidjan', 'Africa/Accra', ... 'US/Pacific', 'US/Samoa', ... 'Zulu']
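A quick way to see which strings qualify is to ask pytz directly. The sketch below assumes pytz is installed; the helper name is_supported_zone is made up for illustration and is not part of pytz.

```python
import pytz


def is_supported_zone(name):
    """Return True if pytz can build a timezone object for this name."""
    try:
        pytz.timezone(name)
    except pytz.exceptions.UnknownTimeZoneError:
        # pytz raises this for any name it cannot find in its database.
        return False
    return True


print(is_supported_zone('US/Pacific'))  # True
print(is_supported_zone('PST'))         # False
```

Note that a handful of short names such as PST8PDT do appear in pytz.all_timezones, so the rule is "in the list or not", not "short or long".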
According to other sources and my own experience, the cleanest way to use timezone strings in Python is to first define a timezone object with a string from the pytz.all_timezones list. An aware datetime object is then created by using the timezone object to localize a naive datetime object (e.g. datetime.now()). The aware datetime object now has all the info it needs.

    import pytz
    from datetime import datetime
    tz = pytz.timezone('US/Pacific')
    dt_naive = datetime.now()
    dt_aware = tz.localize(dt_naive)
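To make it concrete what "all the info it needs" buys you, here is a small sketch with a fixed date instead of datetime.now(), so the result is repeatable. It assumes pytz is installed. It also contrasts localize() with replace(tzinfo=...), which with pytz zones typically attaches the wrong offset.

```python
from datetime import datetime, timedelta

import pytz

tz = pytz.timezone('US/Pacific')
dt_naive = datetime(2018, 1, 1, 2, 0, 1)

# localize() looks up the offset in force on that date (PST, UTC-8, in January).
dt_aware = tz.localize(dt_naive)
print(dt_aware.tzname(), dt_aware.utcoffset())

# With the offset attached, conversion to UTC is unambiguous.
dt_utc = dt_aware.astimezone(pytz.utc)
print(dt_utc.isoformat())  # 2018-01-01T10:00:01+00:00

# replace(tzinfo=...) skips the date lookup; with pytz this usually attaches
# the zone's earliest recorded offset (local mean time), not UTC-8.
dt_suspect = dt_naive.replace(tzinfo=tz)
print(dt_suspect.utcoffset() == timedelta(hours=-8))  # False
```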
What do you do if your timezone string is not in the supported list of pytz.all_timezones? My current timezone of PST (San Diego) happens to be one of those. The reason it is not supported is that, right now in the world, PST has different meanings depending on who is asking. This is the confusing thing about timezones. Very common abbreviations are used in different parts of the world with different local meanings, and that leads to the ambiguity that Python correctly does not want to handle.
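If the source of your data already tells you what an abbreviation means, one simple option is to spell that out yourself. The sketch below assumes pytz is installed; the ABBREVIATION_TO_ZONE table and the parse_with_abbreviation helper are hypothetical names invented for this example, and the mapping itself is an assumption you would adjust to your own data.

```python
from datetime import datetime

import pytz

# Hypothetical lookup table: you decide, from context, what each
# abbreviation means in your data. These entries assume US sources.
ABBREVIATION_TO_ZONE = {
    'PST': 'US/Pacific',
    'PDT': 'US/Pacific',
    'EST': 'US/Eastern',
    'EDT': 'US/Eastern',
}


def parse_with_abbreviation(text, fmt='%d %b %Y %H:%M:%S'):
    """Parse 'DATETIME ABBR' by mapping ABBR to a supported pytz zone name."""
    stamp, abbr = text.rsplit(' ', 1)          # split off the trailing abbreviation
    tz = pytz.timezone(ABBREVIATION_TO_ZONE[abbr])
    return tz.localize(datetime.strptime(stamp, fmt))


dt = parse_with_abbreviation('1 Jan 2018 02:00:01 EST')
print(dt.isoformat())  # 2018-01-01T02:00:01-05:00
```

An abbreviation missing from the table raises KeyError, which seems preferable to guessing. The sketch also ignores the hour that repeats when clocks fall back; handling that would need the is_dst argument of localize().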
To help with this case, I wrote a function that returns a pandas DataFrame of all possible meanings of a timezone string localized to a naive datetime object. It means that if you start with PST as a timezone string (not supported), you will be able to convert PST to a timezone string that is supported, assuming you have enough source context. The function is called get_utc_times and its use case is below.

    In : get_utc_times(datetime.now(), 'PST')
    Out:
        datetime                    zone  equivalent zones        datetime UTC                      hrs wrt UTC
    0   2018-09-16 19:41:52.648372  PST   America/Bahia_Banderas  2018-09-17 00:41:52.648372+00:00  -5
    1   2018-09-16 19:41:52.648372  PST   America/Boise           2018-09-17 01:41:52.648372+00:00  -6
    2   2018-09-16 19:41:52.648372  PST   America/Creston         2018-09-17 02:41:52.648372+00:00  -7
    3   2018-09-16 19:41:52.648372  PST   America/Dawson          2018-09-17 02:41:52.648372+00:00  -7
    ...
    20  2018-09-16 19:41:52.648372  PST   Mexico/BajaNorte        2018-09-17 02:41:52.648372+00:00  -7
    21  2018-09-16 19:41:52.648372  PST   Mexico/BajaSur          2018-09-17 01:41:52.648372+00:00  -6
    22  2018-09-16 19:41:52.648372  PST   PST8PDT                 2018-09-17 02:41:52.648372+00:00  -7
    23  2018-09-16 19:41:52.648372  PST   US/Pacific              2018-09-17 02:41:52.648372+00:00  -7

So if you are in Mexico BajaSur then you refer to local time as PST and you are minus 6 hours with respect to UTC. But if you are up the way in Mexico BajaNorte then you agree with BajaSur that local time is referred to as PST; however, your time is minus 7 hours with respect to UTC. Though other more interesting things can happen in Mexico, it is great that you now understand this aspect of timezones in Python. The get_utc_times function is available on GitHub.
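For readers who just want the general idea without pandas, here is a much simpler sketch of the same kind of lookup. It is not the get_utc_times function from GitHub: the helper name zones_with_abbreviation is invented for this example, it assumes pytz is installed, and it only reports zones whose abbreviation at the given moment equals the string you pass in, so it can return fewer zones than get_utc_times does.

```python
from datetime import datetime

import pytz


def zones_with_abbreviation(naive_dt, abbr):
    """List (zone name, hours relative to UTC) for every pytz zone whose
    abbreviation at naive_dt equals abbr."""
    matches = []
    for name in pytz.all_timezones:
        tz = pytz.timezone(name)
        try:
            # is_dst=None makes pytz raise instead of guessing.
            aware = tz.localize(naive_dt, is_dst=None)
        except pytz.exceptions.InvalidTimeError:
            continue  # wall time is ambiguous or nonexistent in this zone
        if aware.tzname() == abbr:
            hours = aware.utcoffset().total_seconds() / 3600
            matches.append((name, hours))
    return matches


for name, hours in zones_with_abbreviation(datetime(2018, 1, 1, 2, 0, 1), 'PST'):
    print(name, hours)
```

The exact list depends on the version of the timezone database bundled with your pytz install, so treat the output as a set of candidates to check against your source context.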