Leveraging Text-to-Speech and Screen Reading Tools

August 27, 2020

by Anita McCauley

This post is co-authored by Anita McCauley in the Center for the Advancement of Teaching; Eudora Struble and Jonathan Milam in Information Systems’ Office of Technology Accessibility; and Amy Archambault in the Office of Online Education.

Photo by Tomasz Gawłowski on Unsplash.

As we move into the Fall 2020 academic semester, one of the few things known with certainty is that significant learning will take place in the digital realm, accessed via computers, tablets, or smartphones, and viewed on a screen. Whether in fully online, blended, or socially distanced face-to-face modalities, independent and collaborative learning using digital tools in virtual spaces will be central elements of the student learning experience. In addition, students may be asked to spend hours in synchronous lectures and discussions that require viewing a screen. Imagine this worst-case scenario for a student taking 15 credit hours: all their courses require synchronous sessions, an electronic textbook or readings, working problems in the publisher platform, contributing to discussion forums in the learning management system, taking quizzes, and collaborating with peers in Google Docs. The result: this could translate into at least 45 hours of screen time a week just for their courses. (This estimate assumes 15 hours in class each week plus 2-3 hours outside class for every hour in class.)
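
The estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes, as the worst-case scenario does, that one credit hour means one hour in class per week and that all course time is spent on a screen.

```python
def weekly_course_hours(credit_hours, outside_hours_per_class_hour):
    """Estimate weekly course time: in-class hours plus outside-class work."""
    in_class = credit_hours  # assumption: 1 credit hour = 1 class hour per week
    outside = credit_hours * outside_hours_per_class_hour
    return in_class + outside


# The post's range of 2-3 hours outside class for every hour in class.
low = weekly_course_hours(15, 2)
high = weekly_course_hours(15, 3)
print(f"Estimated weekly course time: {low} to {high} hours")
```

With 2 hours of outside work per class hour the total is 45 hours, which is where the "at least 45 hours" figure comes from; with 3 hours it rises to 60.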

Now, imagine the above scenario for two different students: one with a documented visual disability and one learner without a disclosed visual disability. The student with the visual disability may engage with those materials using assistive technologies, adaptive strategies, and materials delivered in alternative formats to ensure access to learning, but those students without disclosed disabilities may also find their learning enhanced through engaging with certain assistive technologies and alternative ways of accessing course content. Providing accessible materials and alternative ways to engage with course content can increase opportunities to learn for all students and create a more inclusive and equitable learning environment.

Listen to the above text as read by Windows Narrator “David”:

Accessibility Makes Learning More Inclusive and Human-Centered for All Students

The educational literature is full of evidence and strategies that call educators to incorporate student-centered, sometimes called learner-centered, course design and learning activities into our pedagogy. Human-centered design takes this a step further. Instead of planning our teaching as if robots inhabited our classrooms, human-centered design asks us to think about the people who are engaging in this learning journey with us; it involves developing empathy with the people you are designing for, and keeping their needs at the heart of the design process.

At the forefront of this mindset is the use of universal design principles. CAST, an educational research and development organization, has created a Universal Design for Learning (UDL) framework to improve and optimize teaching and learning for all people based on scientific insights into how humans learn. One aspect of this framework relates to how learners recognize or receive information and is called the “what” of learning: the ways in which the content essential to learning is presented to students. Multiple modalities for representation of information can give learners distinct, additional ways to interact with the content and can increase the amount of overall time they spend “on task”. One example of leveraging multiple modalities for representing course information and content is reading an assigned text and/or listening to the same text being read aloud.

All the students in your class, not just students with disabilities, can benefit from using multiple modalities to engage with course content. Here are just a few examples of ways that text-to-speech engagement can support and enhance the student learning experience:

Provide more equitable access and engagement with course content for students with diagnosed and undiagnosed learning disabilities.
Provide alternative ways to interact with course content to combat screen fatigue.
Enable mixed modality exposure to course content.
Increase the opportunities for students to spend more time engaging with course materials.

Listen to the above text as read by Apple VoiceOver “Alex”:

Tools to Make Screen-Based and Visual Content More Accessible to More Students

Some students rely on screen readers and text-to-speech tools for engaging with course materials and websites, and others may selectively use text-to-speech to enhance comprehension, or to provide an alternative to visual reading and browsing to reduce fatigue or reinforce learning. Text-to-speech tools offer synthetic machine-generated speech to read accessible digital content. Text-to-speech capabilities are built into many common systems, such as macOS, Windows, iOS, and Android, and can be used with accessible text to produce audio-supported reading.

Keep in mind that these tools take practice to use well: they do not operate with a simple start/stop/pause/rewind style interface, and they function differently in different environments. Many tools offer their best support for English, and synthetic speech will always reflect the machine-generated nature of the voices rather than sounding like the recorded human voice one might hear in an audiobook.

Here are some potential tools for text-to-speech, integrated into common systems. These links will get you started turning on the functionality and be a launch pad for further exploration of these tools.

Mac computers: VoiceOver
Windows computers: Narrator
iPhone: Accessibility including VoiceOver
Android: TalkBack
- TalkBack for Docs
iPad: VoiceOver
Adobe Reader: Read Out Loud Text-to-Speech Tool
EBSCO Library Database: text-to-speech how to for HTML articles

VoiceThread, together with its VT Universal interface, is another academic technology that allows students to engage and participate in course content through multiple modalities of information representation. VoiceThread allows documents, slides, images, and videos to be assembled into a sequential thread that instructors and students can narrate, annotate with comments and questions, and use to reply to each other in video, audio, and text. An assistive technology user, especially a student who relies on a screen reader, will access VoiceThread through the VT Universal interface. Compared with the standard system, VT Universal introduces some additional steps and effort for navigation and use, so it may take some trial and error to get navigation streamlined.

Listen to the above text as read by Android TalkBack “Google TTS English US Voice I”:

Enhancing Instruction for Text-to-Speech Engagement

Instructors can support student use of text-to-speech tools by doing the following:

Be verbally explicit about anything shown visually on your recordings: steps of navigation, words, images, etc.
Offer all students the option to download accessibly designed PowerPoint files or other documents to aid learning and give options for engagement.
Share content in formats that are most accessible for assistive technologies, like accessible Word documents or accessibly-designed web content.
- PDFs may more frequently introduce challenges for screen reading and text-to-speech tools, depending on formatting and tools used.
- Try to avoid sharing images with embedded text. However, if this is unavoidable in a specific case, ensure the text is shared alongside the image.
Ensure that images have alt text that conveys the intended meaning of the image. If a platform or source does not allow alt text to be added, add an image description adjacent to the image. Learn more about accessible images and alt text.
Provide closed captions (CC) whenever possible for video recordings. In addition to giving students captions to reinforce learning, captions can be made into a transcript text file to allow students to review the dialogue from a class via the text document.
- The Closed-Captioning (CC) feature in VoiceThread can be used to create auto-captions in recordings, which can then be corrected to ensure accurate captions.
- Other tools can also support an instructor adding captions to videos. Consider learning more by attending an Introduction to Captioning PDC workshop.
Encourage ALL your students to take advantage of multiple modalities to receive and interact with course content.
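
As one illustration of the captions-to-transcript idea mentioned above, here is a minimal sketch in Python. It assumes your captioning tool can export captions as a WebVTT (.vtt) or SubRip (.srt) file, which this post does not confirm for any particular tool; the function name and sample captions are hypothetical.

```python
import re

# Matches cue timing lines such as "00:00:01.000 --> 00:00:04.000" (WebVTT)
# or "00:00:01,000 --> 00:00:04,000" (SRT).
TIMING = re.compile(r"^\d{2,}:\d{2}(:\d{2})?[.,]\d{3}\s+-->\s+")


def captions_to_transcript(caption_text):
    """Return the spoken text from WebVTT/SRT-style captions as one string."""
    spoken = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", caption_text.strip()):
        block_lines = block.splitlines()
        for i, line in enumerate(block_lines):
            if TIMING.match(line.strip()):
                # Everything after the timing line is caption text.
                spoken.extend(t.strip() for t in block_lines[i + 1:] if t.strip())
                break
        # Blocks without a timing line (e.g. the "WEBVTT" header) are skipped.
    return " ".join(spoken)


sample = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
Welcome to today's session.

2
00:00:04.500 --> 00:00:08.000
We will start with
the assigned reading.
"""

print(captions_to_transcript(sample))
```

The resulting plain text can be saved as a document that students read or open with a text-to-speech tool. Auto-generated captions should still be corrected first, since the transcript will carry over any caption errors.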

When text-to-speech tools are used on compatible digital content, students might:

Listen to a text-to-speech tool to review their notes while out for a walk or walking to class.
Preview or review assigned content by listening while they are riding the shuttle, or cooking dinner, or running errands.
Listen to the text-to-speech tool and take notes on paper in order to take a break from viewing a screen.
Use tools to increase or decrease the reading speed in order to allow flexibility for specific needs or contexts.

Listen to the above text as read by Windows Narrator “Zira”:

Limitations of Experiencing Synthetic Speech

While text-to-speech and screen-reader tools are generally available and may be beneficial, there are limits in the use and experience of synthesized speech. Some characteristics of the human voice cannot easily be conveyed. This may be most evident when synthesized voices are used to read documents and websites in different languages or with technical terminology. Synthesized speech has typically been thought to lack tone and emotion, though over the years this has improved greatly, resulting in higher quality voices that sound less robotic and increasingly human-like. Also, although a person reading aloud may interpret or improvise from the written text, a computer using synthesized speech will read the words exactly as they appear and will not be able to improvise or correct errors. Finally, precise pronunciation can also be an issue. Speech synthesizers use computer code that combines different fragments of phonemes, which are the sounds or groups of different sounds perceived to have the same function by speakers of a language or dialect. As a result, not all words will be perfectly pronounced. As technology continues to develop, the human-like qualities of synthetic voices will likely also improve.

When possible, providing human-read versions of screen or text content can provide an equitable and authentic alternative representation of course content that benefits all students. In his video, 10 Online Teaching Tips Without Zoom, Michael Wesch, an anthropologist and experienced online teacher at Kansas State University, emphasizes the importance of providing content in multiple modalities. In Tip #7: Don’t Waste Their Time, and Tip #8: Read to Them (video time: 07:44 – 08:57), Wesch talks about making master mp3 files of the course content for each unit or module. Students can then download and listen to that content when and where it works for them. He also emphasizes the value of creating recordings of himself reading course texts aloud. In addition to making the texts more accessible, this also creates the opportunity to pause and insert commentary, demonstrate critical thinking, make explicit connections to other content, and ask questions. This type of interaction with the text transforms how students engage with the readings and models for them how an expert interrogates, reflects on, and integrates ideas to make meaning. These recorded readings are also an effective way to convey enthusiasm and passion for the course.

Admittedly, creating these human-voiced alternatives to physical texts, screens, and synthetic screen readers is a time-consuming task. However, once made, these resources can be used many times in the future. One approach might be to create a master mp3 file, or a recorded reading of course texts, for just one topic in your course. Then, build out your library of recorded files during future semesters. In order to ensure fair use, these recordings should only be used by the instructor who made them. In all cases, remember to weigh copyright and fair use issues when deciding if and how to share recordings of course material. Learn more by visiting blog resources such as Who Owns Course Materials: Courses & Copyright Conundrums and reaching out to Molly Keener for more support. To learn more about assistive technologies, text-to-speech and screen-reader tools, or universal design for learning, please reach out to us. We’d love to talk and brainstorm ways you can make the content in your course more accessible and human-centered!

Listen to the above text as read by a human voice:
