Living Topics

Show the pulse of what the Swiss Government is working on

In this project, we briefly explored the TERMDAT API and collected some basic data from each Swiss government level and department, comparing the results with Wikipedia and media sources. Our data and Python notebooks can be found in the Sources.

See also:

RestructFeedback

Use AI for the evaluation of consultation procedures

Challenge

Show the pulse of what the Swiss Government is working on, by combining up-to-date terminology databases with texts published by the Government.

Words, phrases, and terms in a governmental context have one special feature: because they often relate to legal concepts and form the basis of far-reaching decisions, standing terms (also called “named entities” in computer science) need to be well defined, often by a legal basis. To work with these domain-specific vocabularies, the standing terms are organized in terminology databases and can be accessed through TermDat (BkTermdatUi). The goal of this challenge is to harness the expressivity and freshness of the terminologies provided by TermDat to create a high-quality map of what topics the Swiss Government is currently working on.

"The Swiss Confederation currently has almost 40k employees (~36k full-time equivalents). It is difficult not only for the citizens of Switzerland to grasp the breadth and depth of the topics being worked on; the same is true for the employees within the Swiss Confederation. Therefore, it is important to have a high-quality overview of the ongoing work and of how the focus on ongoing themes changes overall. Conversely, the specialists working on the terminology database are not able to read all the new texts published every day and would welcome a tool that allows them to harvest newly coined terms, or terms used in a new context or with a different translation. Finally, end-users of texts might want to be able to click on a given (technical) word and receive a definition of that word."

Organization: BK Federal Chancellery

Prepared Material:

Living Topics (Proto Alpha)

TL;DR: In this hackathon experiment, we examine data describing Swiss government functions, comparing results with Wikipedia and media sources. Some notebooks with an initial set-up for machine learning, along with the results of a crawl, can be found in this repository.

Hackathon journey

This code project is based on the "Living Topics" challenge, proposed at #GovTechHack23 on March 23, 2023. Here is the gist of it:

The goal of this challenge is to harness the expressivity and freshness of the terminologies provided by TermDat to create a high-quality map of what topics the Swiss Government is currently working on.

Looking at this problem statement, we first have to take a step back: what is the structure of the Swiss government, what is the scope of 'topics', where would you start - in other words, what would be the high-level 1:1000000 map of the administrations?

1:1 Million Map of Switzerland (swisstopo shop)

After some discussion, we came up with a slightly more accessible version of the Living Topics challenge: instead of working bottom up, at the level of current topics as originally stated, let us begin at the top level of government and obtain definitions of the functions and responsibilities of government departments. The more detail we have, the better we can classify a topic as belonging to one office or another. From here, step by step, we would be able to identify specific current affairs.

Organigram from the 2020 edition, see also 2023 update.

We begin at the top. Helpfully, the Federal Chancellery produces an illustrated guide to the political and administrative system in Switzerland (ch-info.swiss), available in print, online and in an app. This gives a brief overview of the departments, with some detail of their function. Unfortunately, we could not find the source code or any way to bulk-download from this website, so we keep searching.

172.010.1 Government and Administrative Organization Ordinance

The FEDLEX service provides the legal documents that serve as the mandated basis for the administrations. We find the interface clumsy and the document layouts not machine-readable: even when we export the XML version, we get impractical HTML tables inside. Nevertheless, our discussion leads us to explore the State Calendar as an alternative source of hierarchical structure, which in turn leads us to quickly update a long-overdue open data source on public bodies.

What does 'the Internet' have to say about all this?

Screenshot of ChatGPT by OpenAI.

Hmm, wonder where 'the Internet' gets this data from?

Screenshot of four language editions (EN-IT-FR-DE) of a Wikipedia article.

The Wikipedia page Federal administration of Switzerland provides a similar overview. We found the very complete content of the German edition to be somewhat out of date, the English edition nearly as complete, the French significantly shorter, and the Italian practically empty. Using the MediaWiki API, also via a handy Python wrapper, it is possible to quickly get the contents of Wikipedia pages. And in an edit-a-thon, we could update them and improve the translations.
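As a rough sketch of the Wikipedia step: the MediaWiki Action API can return a plain-text extract of a page through its `extracts` property. The helper names below (`build_extract_url`, `parse_extract`, `fetch_extract`) and the User-Agent string are assumptions made for illustration, not the code from our notebooks. The network call is kept apart from the parsing so that the parsing can be checked offline.

```python
import json
import urllib.parse
import urllib.request


def build_extract_url(title: str, lang: str = "en") -> str:
    """Build a MediaWiki Action API URL asking for a plain-text page extract."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,  # plain text instead of HTML
        "redirects": 1,    # follow page redirects
        "format": "json",
        "titles": title,
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urllib.parse.urlencode(params)


def parse_extract(payload: dict) -> str:
    """Return the extract of the first page in an API response, or '' if absent."""
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        return page.get("extract", "")
    return ""


def fetch_extract(title: str, lang: str = "en") -> str:
    """Download and parse the extract (requires network access)."""
    request = urllib.request.Request(
        build_extract_url(title, lang),
        headers={"User-Agent": "living-topics-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_extract(json.load(response))
```

Calling `fetch_extract("Federal administration of Switzerland", "en")` should return the article text; the other language editions use their own page titles, which would need to be looked up separately.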

Screenshot of successive Linguee.com searches.

What else could we try? A series of searches on Linguee (a dictionary service that is part of DeepL) provided some clues about various government websites and media repositories describing the responsibilities of the federal, cantonal and municipal governments.

Screenshot of Nicht Sache der Kantone (NZZ 2009).

Finally, we explore the media landscape. At other hackathons like the recent Rethink Journalism event, we had a chance to work with press databases - some of which would be excellent resources to understand expectations and questions about the function of government from the outside in. We leave this avenue for a future foray, though we trust that the web services of the Confederation would be the best starting point.

Which brings us to the point of departure of the hackathon: the I14Y Interoperability Platform. We decide to use the TERMDAT API to work through the main levels and units of government in sequence, though all three of the available endpoints, such as the News Service API, are interesting:

Screenshot of I14Y Interoperability Platform

Continuing with the questions we explored above, we first try the relatively straightforward web interface, punching in some test searches that seem to give promising, if limited, results:

Screenshot of TERMDAT

It becomes clear that we need to be very precise, and correct, in our queries. First, we create a simple folder structure: bund (Federal), kantone (Cantonal) and gemeinde (Municipal) for the three levels of government, then bk, uvek, edi ... for the main government departments. In these folders we can put text files (termdat.txt, wikipedia.txt, ...) that help us create a classifier for topics related to these departments.
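The folder layout can be sketched in a few lines of Python. This is a minimal illustration, under the assumption that the department folders sit inside bund; only the three departments named above are listed, and the function names are hypothetical.

```python
from pathlib import Path

LEVELS = ("bund", "kantone", "gemeinde")
# Only the departments named in the text; nesting them under "bund" is an assumption.
DEPARTMENTS = ("bk", "uvek", "edi")


def create_structure(root: Path) -> list[Path]:
    """Create one folder per government level and per federal department."""
    created = []
    for level in LEVELS:
        folder = root / level
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    for department in DEPARTMENTS:
        folder = root / "bund" / department
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def save_text(root: Path, department: str, source: str, text: str) -> Path:
    """Write the text from one source (e.g. 'termdat') into a department folder."""
    target = root / "bund" / department / f"{source}.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

For example, `save_text(Path("data"), "bk", "wikipedia", text)` would write data/bund/bk/wikipedia.txt.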

We write a simple aggregator to repeatedly query the TERMDAT API and save the descriptions (or any available notes) about the departments into these folders. One of the issues we experienced was minor inconsistencies in the data schema (missing description fields), which our code works around.
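One way to tolerate missing fields is to try several candidate fields in turn and skip records that have none of them. The sketch below works on records that have already been decoded from JSON; the field names in `TEXT_FIELDS` are hypothetical placeholders rather than the actual TERMDAT schema, and would need to be adapted to the real responses.

```python
from typing import Any, Iterable

# Hypothetical field names, tried in order of preference; not the real TERMDAT schema.
TEXT_FIELDS = ("description", "note", "definition")


def extract_text(record: dict[str, Any]) -> str:
    """Return the first non-empty text field of a record, or '' if none is present."""
    for field in TEXT_FIELDS:
        value = record.get(field)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return ""


def collect_descriptions(records: Iterable[dict[str, Any]]) -> list[str]:
    """Collect unique, non-empty texts from the records, keeping their order."""
    seen: set[str] = set()
    texts: list[str] = []
    for record in records:
        text = extract_text(record)
        if text and text not in seen:
            seen.add(text)
            texts.append(text)
    return texts
```

Records without any usable text are dropped silently here; logging them instead would make it easier to report the schema inconsistencies upstream.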

Screenshot of API docs, Jupyter notebook, search results.

At this point, we look into the question of how best to classify these texts. Using a Sentence Similarity model like gBERT-large-sts-v2, which has a fine-tuned version by Deutsche Telekom, we can use a cloud-based API or run our own inference service to work out the appropriate department. We have some initial code, but could not get results until a few hours after the deadline.
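The classification idea can be illustrated without a model: embed the new text and each department description, then pick the department with the highest cosine similarity. In this sketch a toy bag-of-words embedding stands in for the sentence model, so the scores are only a placeholder for what a real Sentence Similarity model would produce. All names are assumptions for illustration, and the `embed` parameter is where a real model's encoding function would be swapped in (with the similarity computed on its dense vectors instead).

```python
import math
import re
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def bag_of_words(text: str) -> Vector:
    """Toy embedding: lower-cased word counts. A stand-in for a sentence model."""
    return dict(Counter(re.findall(r"\w+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(
    text: str,
    department_texts: dict[str, str],
    embed: Callable[[str], Vector] = bag_of_words,
) -> tuple[str, float]:
    """Return the best-matching department and its similarity score."""
    query = embed(text)
    scores = {
        department: cosine(query, embed(description))
        for department, description in department_texts.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

The `department_texts` mapping would be filled from the collected text files, one entry per department folder.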

Screenshot of sentence-transformers notebook

We are nevertheless motivated to continue on this idea, and would be happy to hear feedback & suggestions via GitHub Discussions.

License

MIT


Repository updated

26.03.2023 22:58 ~ loleg

Event finished

24.03.2023 16:00

Hello, World

Initial commit

Get

24.03.2023 13:32

Saw a GitHub error today? Read security notice, rotate your keys

24.03.2023 13:32 ~ loleg

Find

24.03.2023 13:30

We worked with the TERMDAT API, found minor inconsistencies, got help from Raphaël (BK) to collect the data we needed.

24.03.2023 13:30 ~ loleg

Ask

24.03.2023 11:06

Joined the team

24.03.2023 11:06 ~ loleg

Event started

23.03.2023 09:00

Edited content version 7

17.03.2023 16:19 ~ l00mi

Edited content version 5

17.03.2023 16:19 ~ l00mi

Joined the team

17.03.2023 16:16 ~ l00mi
All participants, sponsors, partners, volunteers and staff of our hackathon are required to agree to the Hack Code of Conduct. The organizers will enforce this code throughout the event. We expect cooperation from all participants to help ensure a safe environment for everyone. More details can be found in the GovTech Hackathon Guidelines.

Creative Commons Licence: Unless otherwise stated, the content of this website is licensed under a Creative Commons Attribution 4.0 International License.

GovTech Hackathon 2023