zeit-online package

Article Class

Article

The Article class is another central class of the zeit-online package and lets you extract a lot of metadata and other useful information about an article. Unfortunately, Zeit Online does not provide the full text, only a teaser text and sometimes a snippet, but the available data is still valuable.

Initialization

Normally, you get an Article object as the return value of an API.get_article(…) call. You can also instantiate it yourself, although this is rather tedious. After initialization, the object exposes several attributes, such as the title and the teaser text.

Attributes:

  • self.title: the title of the article
  • self.href: the link to the article on zeit.de
  • self.text: the teaser text
  • self.supertitle: the supertitle associated with the article
  • self.id: the UUID of the article
  • self.uri: the Uniform Resource Identifier (URI) of the article

Example

#search for an article
api.search_for("Bundestag", limit=1)

Output:

Search for 'Bundestag': 42945 results, limit: 1, matches : 
 
CSU-Politiker Nüßlein zieht sich aus der Politik zurück: http://api.zeit.de/content/7zJHkLefQuFQKQMgMx6WXW

Now we fetch the article by its ID:

article = api.get_article("7zJHkLefQuFQKQMgMx6WXW") #get the article we just searched for
print(article)

Output:

Article with title 'CSU-Politiker Nüßlein zieht sich aus der Politik zurück' UUID: 7zJHkLefQuFQKQMgMx6WXW, URI: http://api.zeit.de/content/7zJHkLefQuFQKQMgMx6WXW 
teaser_text: 'Nach Korruptionsvorwürfen kandidiert der Unionsfraktionsvize nicht wieder für den Bundestag. Er war in Masken-Geschäfte involviert. Von den Koalitionspartnern kam Kritik.'
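
In the sample output above, the UUID is the last path segment of the URI. Assuming this holds in general (it is inferred from this one example, not from the API documentation), the ID that api.get_article(…) expects can be cut out of a search-result URI. The helper name uuid_from_uri is made up for this sketch and is not part of the package:

```python
def uuid_from_uri(uri: str) -> str:
    """Return the last path segment of a content URI (hypothetical helper)."""
    # Assumption: content URIs look like http://api.zeit.de/content/<UUID>
    return uri.rstrip("/").rsplit("/", 1)[-1]


# URI copied from the search output above
uri = "http://api.zeit.de/content/7zJHkLefQuFQKQMgMx6WXW"
article_id = uuid_from_uri(uri)
print(article_id)
```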

get_keywords()

This method extracts the keywords of the article. It returns a dictionary of the form {uri: name, uri: name, …}, so you can get the keyword URIs with dictionary.keys() and the keyword names with dictionary.values().

Example

article.get_keywords() #extract the keywords from the article

Output:

{'http://api.zeit.de/keyword/csu': 'CSU',
 'http://api.zeit.de/keyword/bundestag': 'Bundestag',
 'http://api.zeit.de/keyword/bundestagswahl': 'Bundestagswahl',
 'http://api.zeit.de/keyword/korruption': 'Korruption',
 'http://api.zeit.de/keyword/staatsanwaltschaft': 'Staatsanwaltschaft'}
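
Because the result is a plain dictionary, the usual dictionary methods apply. The following sketch works on a copy of the output above, so it runs without the package or an API key; with a real article, the keywords variable would instead hold the return value of article.get_keywords():

```python
# Copied from the sample output above; stands in for article.get_keywords()
keywords = {
    "http://api.zeit.de/keyword/csu": "CSU",
    "http://api.zeit.de/keyword/bundestag": "Bundestag",
    "http://api.zeit.de/keyword/bundestagswahl": "Bundestagswahl",
    "http://api.zeit.de/keyword/korruption": "Korruption",
    "http://api.zeit.de/keyword/staatsanwaltschaft": "Staatsanwaltschaft",
}

keyword_uris = list(keywords.keys())     # the API URIs
keyword_names = list(keywords.values())  # the human-readable names
print(keyword_names)

# Check whether the article is tagged with a given keyword name
has_topic = "Korruption" in keywords.values()
print(has_topic)
```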

get_authors()

Returns the authors who wrote or contributed to the article. Like get_keywords(), it returns a dictionary of the form {uri: name}.

Example:

article.get_authors() #get the authors associated with the article

Output:

{'http://api.zeit.de/author/Tilman-Steffen': 'Tilman Steffen'}
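
An article can have more than one author, so it is often convenient to join the names into a single byline. A minimal sketch, using a copy of the output above in place of a real article.get_authors() call:

```python
# Copied from the sample output above; stands in for article.get_authors()
authors = {"http://api.zeit.de/author/Tilman-Steffen": "Tilman Steffen"}

# Join all author names into one string, sorted for a stable order
byline = ", ".join(sorted(authors.values()))
print(f"By {byline}")
```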

get_date()

The API returns the release date as an ISO 8601 string. This method parses it and returns a pandas Timestamp, which is easier to work with.

Example:

article.get_date()

Output:

Timestamp('2021-03-05 16:04:47+0000', tz='UTC')
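
A pandas Timestamp is a subclass of Python's datetime.datetime, so the familiar datetime attributes and methods work on it. The sketch below uses a standard-library datetime with the same value as a stand-in, so it runs without the package or pandas; the fixed +01:00 offset is an assumption that only matches German local time outside of daylight saving time:

```python
from datetime import datetime, timedelta, timezone

# Stand-in for article.get_date(): same instant as the output above
published = datetime(2021, 3, 5, 16, 4, 47, tzinfo=timezone.utc)

print(published.date())                      # calendar date only
print(published.strftime("%d.%m.%Y %H:%M"))  # German-style formatting

# Convert to a fixed UTC+1 offset (CET, assumed; ignores daylight saving time)
cet = timezone(timedelta(hours=1))
local = published.astimezone(cet)
print(local.isoformat())
```

On a real Timestamp, tz_convert("Europe/Berlin") would handle daylight saving time as well.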

Notebook

Take a look at the article.ipynb notebook for further examples.

This project is maintained by skriptum