Natural Language Generation: Using AI in Content Generation

Thanks to Sebastian from Retresco for sharing!

Artificial Intelligence (AI) is a hot topic. But have you heard of Natural Language Generation (NLG)? You might have already encountered it in apps and websites that have chatbots or smart assistants. Many of these use AI to generate language on the fly. More and more websites are also using automatic text generation to provide targeted and personalized content. NLG will soon be available for the enterprise open source TYPO3 CMS, too, thanks to an integration with Retresco’s textengine.io.

News portals use NLG software, an approach sometimes called “robot journalism,” to produce sports news, weather reports, or stock market updates. Using the same technology, real estate portals and online merchants create listings and product descriptions for eCommerce systems. Adding NLG capabilities to TYPO3 enhances its already potent abilities to manage structured data and deliver large, performant, information-rich websites and applications.

80% data, 20% magic - How does NLG work?

“Automatically generating natural language texts isn’t magic,” stresses Retresco’s Bernd Bretz. “80% of it is having good, structured data … and maybe 20% is magic,” he jokes.

NLG makes the most sense where a lot of structured, semantic data is involved, as in weather reporting or sports results. Large quantities of primary data, such as locations, names, and common occurrences, are added to a generator’s database. The structure of the content to be generated is determined by even more data: templates and conditions. Templates are essentially fill-in-the-blank texts with a large number of variants, synonyms, adverbs, and other lexicon entries. Conditions are the circumstances that must be fulfilled for a particular template to be used. To automatically generate content, the system determines which conditions apply, combines generic information with event-specific data, and uses linguistic analysis to produce the final text.
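To make the idea of templates and conditions more concrete, here is a minimal Python sketch for a weather report. It is an illustration only, not how textengine.io works internally: the `Template` class, the `generate` function, the 25 °C threshold, and the field names `city` and `temp_c` are all assumptions made up for this example.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Template:
    """A fill-in-the-blank text with several variants and one condition (hypothetical)."""
    condition: Callable[[Dict], bool]  # must be fulfilled for the template to be used
    variants: List[str]                # alternative wordings with {placeholders}


# Illustrative templates; a real system would hold many more variants and synonyms.
TEMPLATES = [
    Template(
        condition=lambda d: d["temp_c"] >= 25,
        variants=[
            "It will be a hot day in {city}, with highs of {temp_c} °C.",
            "{city} can expect summer weather, reaching {temp_c} °C.",
        ],
    ),
    Template(
        condition=lambda d: d["temp_c"] < 25,
        variants=[
            "Temperatures in {city} will stay moderate at {temp_c} °C.",
            "{city} will see a mild {temp_c} °C today.",
        ],
    ),
]


def generate(data: Dict, rng: Optional[random.Random] = None) -> str:
    """Pick the first template whose condition holds and fill in the data."""
    rng = rng or random.Random()
    for template in TEMPLATES:
        if template.condition(data):
            # Choosing a random variant keeps repeated reports from sounding identical.
            return rng.choice(template.variants).format(**data)
    raise ValueError("No template matches the given data")


print(generate({"city": "Berlin", "temp_c": 28}))
```

The structured data (city and temperature) drives both which template is selected and what is filled into its gaps, which is the "80% data" part of the quote above.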

The software engine also knows how to arrange the templates in a given order, known as the ‘story plot’ or ‘narrative.’ Human editors need to create templates and define conditions when setting up the generation framework, but once they are in place alongside the necessary data, the system can work independently.
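The story plot can be pictured as an ordered list of sections, each with its own templates and conditions. The following Python sketch shows that idea for a football match report. The section names, the `tell_story` function, and the match fields are hypothetical and chosen only for illustration; they do not reflect any specific product's data model.

```python
from typing import Callable, Dict, List, Tuple

# A rule pairs a condition with the template to use when that condition holds.
Rule = Tuple[Callable[[Dict], bool], str]

# Hypothetical sections set up once by human editors.
SECTIONS: Dict[str, List[Rule]] = {
    "result": [
        (lambda m: m["home_goals"] > m["away_goals"],
         "{home} beat {away} {home_goals}-{away_goals}."),
        (lambda m: m["home_goals"] < m["away_goals"],
         "{away} won {away_goals}-{home_goals} away at {home}."),
        (lambda m: m["home_goals"] == m["away_goals"],
         "{home} and {away} drew {home_goals}-{away_goals}."),
    ],
    "highlight": [
        # Only used for high-scoring games; otherwise the section is skipped.
        (lambda m: m["home_goals"] + m["away_goals"] >= 5,
         "It was a high-scoring game."),
    ],
    "outlook": [
        (lambda m: True, "{home} play next on {next_match}."),
    ],
}

# The story plot: the order in which sections appear in the final text.
STORY_PLOT = ["result", "highlight", "outlook"]


def tell_story(match: Dict) -> str:
    """Walk through the plot and emit the first matching template per section."""
    sentences = []
    for section in STORY_PLOT:
        for condition, template in SECTIONS[section]:
            if condition(match):
                sentences.append(template.format(**match))
                break  # one sentence per section in this sketch
    return " ".join(sentences)


match = {"home": "Team A", "away": "Team B",
         "home_goals": 2, "away_goals": 1, "next_match": "Saturday"}
print(tell_story(match))
# prints "Team A beat Team B 2-1. Team A play next on Saturday."
```

Once the sections and the plot are defined, new match data can be turned into text without further editorial work, which mirrors the point that the system can run independently after setup.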

The potential of NLG

NLG opens up entirely new possibilities for websites. Using text generation, you can have specialized or personalized content created in real time for your users, and scaling is limited only by the capacity of your infrastructure. NLG software can even be used to create content in many languages, either by translating an original source text (also automatically generated) or by providing the customizations needed for a given region or market.

News-driven portals can expand their coverage of niche topics with NLG, increasing their reach. Online retailers can quickly and easily adapt content and update their catalogs when the need arises, such as responding to conversion optimization data or promoting seasonal campaigns.

Automatic content creation does have its limits. NLG makes little sense, for example, where background reports or expert opinions on specific topics are required. However, it does free your human copywriters and editors from repetitive, low-value tasks so that they can deliver the critical, high-value results that you need from them.

NLG applications for every website?

NLG need not be limited to specific industries or only used on high-traffic websites. In principle, any organization, public institution, or company of any size that possesses structured data sets can use these applications to automate content creation for its websites and applications, and to deliver that content to its users and stakeholders.

Mature NLG software can be offered as a Software-as-a-Service (SaaS) solution that any CMS can connect to. At the same time, the applications are becoming more and more intuitive, and no programming knowledge is necessary to set them up and benefit from what they have to offer.

The soon-to-be-released integration between textengine.io, the Natural Language Generation platform from Berlin-based Retresco, and TYPO3 CMS will make this particularly easy. You’ll be able to control it all directly in the TYPO3 admin backend.

This article is the first post in a short series about NLG. Stay tuned for our next posts on NLG best practices, automated content generation, and the strengths of TYPO3 + textengine.io.