Automatically Identifying Writing Strategies for Science Communication

A step towards automated tools to support science writing.

This blog post explains our paper, Writing Strategies for Science Communication: Data and Computational Analysis, which was published this year at the Conference on Empirical Methods in Natural Language Processing (EMNLP). If you’re interested in the technical details of the work, feel free to check out our paper or the project’s GitHub repo!

The problem

It’s important that the public gets trustworthy, understandable information about new scientific findings. But have you ever tried reading a scientific paper or talking to a scientist about their research? Chances are you and the scientist both left with a headache!

What we did

To tackle this communication problem, we used modern natural language processing (NLP) techniques to automatically identify writing strategies for science communication. Our goal is to eventually build tools that give writers automatic feedback on how to use these strategies in their own writing to best communicate with their audience.

Two paragraphs of text, highlighted in different colors to show which strategy each passage uses.
Example of our annotations of the writing strategies.

What we found

We found exciting differences in how venues use the strategies. For example, press releases highlight the real-world impact of research more than scientific magazines do. Magazines, in turn, use less specialized jargon and incorporate more storytelling, active voice, and present tense into their sentences.
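If you’re curious how a venue comparison like this can be computed, here is a minimal sketch in Python. It assumes each sentence has already been labeled with the strategies it uses, and it reports the share of each venue’s sentences that use each strategy. The venue names, strategy labels, rows, and the strategy_rates helper are all made up for illustration; they are not our dataset, our results, or the code from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical annotated sentences: (venue, set of strategy labels).
# These rows are invented for illustration only; they are not real data.
annotated = [
    ("press_release", {"impact"}),
    ("press_release", {"impact", "jargon"}),
    ("press_release", {"jargon"}),
    ("magazine", {"storytelling", "active_voice"}),
    ("magazine", {"storytelling"}),
    ("magazine", {"impact", "active_voice"}),
    ("magazine", set()),  # a sentence that uses none of the strategies
]


def strategy_rates(rows):
    """Return {venue: {strategy: share of that venue's sentences using it}}."""
    totals = Counter()              # number of sentences per venue
    counts = defaultdict(Counter)   # per-venue count of sentences per strategy
    for venue, strategies in rows:
        totals[venue] += 1
        counts[venue].update(strategies)
    return {
        venue: {s: n / totals[venue] for s, n in counts[venue].items()}
        for venue in totals
    }


rates = strategy_rates(annotated)
for venue, by_strategy in sorted(rates.items()):
    for strategy, rate in sorted(by_strategy.items()):
        print(f"{venue:14s} {strategy:13s} {rate:.2f}")
```

Normalizing by the number of sentences per venue matters here: raw counts would favor whichever venue contributes more text, while shares can be compared directly.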

PhD Student at UW exploring ways of building writing tools to tailor language to better engage different audiences.