Natural language processing – do you understand the language of the future?
Maybe you have already heard of natural language processing, or NLP for short. For many, however, it is still an unfamiliar term. In fact, you probably already use various forms of NLP in your day-to-day life without being aware that you are doing so – for instance when you use Siri, Google Assistant or Alexa.
But what is behind NLP, and why will this core discipline of information gathering continue to gain importance?
We asked Professor Manfred Vogel from the University of Applied Sciences and Arts Northwestern Switzerland (FHNW) our most burning questions on the subject of NLP. He is Head of Information Processing at the Institute for Data Science at the FHNW and an expert on NLP.
1. Professor Vogel, let’s start with a simple question. What does the abbreviation NLP stand for and what is it all about?
NLP stands for “natural language processing”, which is the processing of natural spoken and written language with computers. It is about the automatic comprehension and generation of language, the acquisition of data and information from words and finally about the interaction between humans and machines.
2. Why is NLP so relevant today and what is it needed for? Can you give a practical example of how it is used?
The quantity of data humans are producing is exploding, and the majority of this data is unstructured, i.e. in the form of text. To be able to process this flood of data, NLP methods are indispensable. Let’s take health insurance reimbursement claims as an example. Around 100 million of these documents are processed every year in Switzerland, and in most cases this still requires manual work. In the future, routine tasks like this will be performed by intelligent, NLP-based systems. A great deal of office work in general will one day be automated; NLP will play a very important part in this and have a decisive effect in shaping Office 4.0.
3. How much is “maths/numbers” and how much is “language/letters” in NLP? In other words, which proportion of NLP is effectively made up of language and which of mathematics?
Today’s NLP models are based on machine learning methods and maths does indeed play a very important role in them. Fortunately, however, there are many software libraries and frameworks that will take care of the most difficult mathematical operations for you. You don’t have to be a mathematician to do NLP, but a good understanding of maths is required to create useful models or carry out analyses.
4. Working with NLP is rapidly becoming a popular career. How and where can you learn NLP, and which skills do you need to bring with you?
Universities have recently started teaching NLP, both at Bachelor’s and Master’s level. Specialised further training courses are also available. And what is particularly important: there are a number of good online courses and tutorials that are either free or cost relatively little.
5. What are you researching at the moment? Which problems are there with regard to the development of NLP? Which challenges do you encounter? What is the current status of the research and what potential do you see in NLP?
Our institute works on machine learning and deep learning projects in general, very often in collaboration with business and industry partners. We focus on various topics, such as the automated processing of documents for accounting systems, health insurers, tax returns, customs documents, etc. In collaboration with SwissGlobal Language Services AG, we are also developing a proprietary transcription engine. It supports the transcription of speech recordings (audio) and translations between various languages, and it enables us to optimise both with sector-, customer- or project-specific corpora.
A special project that we are working on in collaboration with the ZHAW and the University of Zurich is the development of a speech-to-text system that translates spoken Swiss German into High German text. The particular difficulty with this is getting hold of the necessary data (audio recordings with transcriptions) as there is no standardised written version of Swiss German and the language has many dialects.
6. What added value does NLP generate for a language services provider like SwissGlobal? Which potential benefits and opportunities will SwissGlobal’s customers be able to enjoy thanks to NLP?
In principle, NLP can relieve practically every company of laborious and tedious routine tasks and perform them to a high standard, be it processing documents or analysing speech recordings. This enables them to offer more affordable products and services at the same or even a higher standard of quality than manual processing allows and increases the production speed.
7. Finally, can you tell us something about Innosuisse and the SwissGlobal Transcription and Translation Supporting Engine project?
Innosuisse enables small and medium-sized companies, and start-ups in particular, to realise innovative development projects. The principle is that the company must invest at least as much (in labour and/or money) as Innosuisse, with the latter funding the research and development activities of the university partner. The Transcription and Translation Supporting Engine project is being conducted by the FHNW Institute for Data Science in collaboration with SwissGlobal Language Services AG. The proprietary engine guarantees data security and confidentiality, which is not the case with general language services such as Google Translate and DeepL. The quality is also much better, as the engine receives domain-specific training.
Interested in finding out more about the development of NLP? Are you even considering doing some training or a qualification in the field of NLP?
The FHNW will be happy to provide you with further information. A description of the degree module in NLP is provided here (in German).
Or would you like to optimise your language projects with the help of state-of-the-art language technologies? Then contact our SwissGlobal team for a no-obligation consultation.