By Daisuke Wakabayashi
Feb. 23, 2017
SAN FRANCISCO — From self-driving cars to multi-language translation, machine learning, a form of artificial intelligence, is underpinning many of the technology industry’s biggest advances.
Now, Google’s parent company, Alphabet, says it plans to apply machine learning technology to promote more civil discourse on the internet and make comment sections on sites a little less awful.
Jigsaw, a technology incubator within Alphabet, says it has developed a new tool for web publishers to identify toxic comments that can undermine a civil exchange of ideas. Starting Thursday, publishers can start applying for access to use Jigsaw’s software, called Perspective, without charge.
“We have more information and more articles than any other time in history, and yet the toxicity of the conversations that follow those articles is driving people away from the conversation,” said Jared Cohen, president of Jigsaw, formerly known as Google Ideas.
Unless carefully managed, discussion in comments sections often devolves into hateful exchanges. This has prompted some publishers to turn off comments altogether, because moderating them can be time-consuming.
With machine learning, a computer system is programmed to learn from repetition. It takes in training data — essentially, example after example — until it is familiar enough to anticipate with a high degree of confidence the proper response.
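The learn-from-examples idea described above can be sketched with a tiny word-counting classifier. Everything here is invented for illustration — the training comments, the function names, and the model itself (a minimal naive Bayes) — and is not Jigsaw's actual system, which was trained on hundreds of thousands of human-reviewed comments.

```python
import math
from collections import Counter

# Toy training data: (comment, label), where 1 = toxic, 0 = civil.
# These examples are invented for illustration.
TRAINING = [
    ("you are an idiot", 1),
    ("what a stupid take", 1),
    ("go away, nobody wants you here", 1),
    ("thanks for the thoughtful article", 0),
    ("i respectfully disagree with this point", 0),
    ("great reporting, very informative", 0),
]

def train(examples):
    """Count word frequencies per class: the 'learning' step."""
    counts = {0: Counter(), 1: Counter()}
    totals = {0: 0, 1: 0}
    for text, label in examples:
        for word in text.split():
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def toxicity_score(text, counts, totals):
    """Return a 0-100 score: how much more a comment's words resemble
    the toxic training examples than the civil ones (Laplace-smoothed)."""
    vocab = set(counts[0]) | set(counts[1])
    log_odds = 0.0
    for word in text.split():
        p_toxic = (counts[1][word] + 1) / (totals[1] + len(vocab))
        p_civil = (counts[0][word] + 1) / (totals[0] + len(vocab))
        log_odds += math.log(p_toxic / p_civil)
    # Squash the log-odds into a 0..100 range with a logistic curve.
    return 100 / (1 + math.exp(-log_odds))

counts, totals = train(TRAINING)
```

With more examples, the counts become more reliable and the scores more confident — the "repetition" the article describes.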
In this instance, Jigsaw had a team review hundreds of thousands of comments to identify the types of comments that might deter people from a conversation. Based on that data, Perspective provides a score from zero to 100 indicating how similar a new comment is to those identified as toxic.
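A publisher querying a scoring service like Perspective would typically send each new comment to an API and get a toxicity score back. The request sketched below follows the shape of Perspective's published comment-analysis API, but the endpoint, field names, and attribute name should be treated as assumptions here and checked against the current documentation; the article itself does not describe the wire format.

```python
import json

# Endpoint as documented for Perspective's comment analyzer; verify
# against the current API docs before relying on it.
API_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(comment_text):
    """Assemble a request body asking the service for a toxicity score."""
    return {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }

payload = build_request("You should be ashamed of yourself.")
body = json.dumps(payload)  # serialized JSON a publisher would POST
```

The response would carry the comment's toxicity score, which the publisher can then store alongside the comment for moderation or filtering.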
Jigsaw said it settled on the word toxic after finding that most reviewers shared views about what types of comments drive people away from a conversation. Opinions about what comments constituted, for example, a personal attack ranged widely.
The same scoring is being made available to publishers, who could use it to have human moderators review only comments that register above a certain threshold, or to let readers filter out comments above a chosen level of toxicity.
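Both uses the article describes — a moderator queue and a reader-side filter — amount to thresholding on the score. The comments, scores, and helper names below are invented for illustration.

```python
# Hypothetical comments with toxicity scores on the article's 0-100 scale.
SCORED = [
    ("Great piece, thank you.", 4),
    ("I disagree, and here is why...", 12),
    ("Only a moron would believe this.", 87),
    ("This is borderline rude.", 55),
]

def moderation_queue(scored, threshold):
    """Comments a human moderator should review: at or above threshold."""
    return [text for text, score in scored if score >= threshold]

def reader_view(scored, max_toxicity):
    """Comments shown to a reader who filters out high-toxicity replies."""
    return [text for text, score in scored if score <= max_toxicity]
```

A publisher might route only the handful of high-scoring comments to its moderators while publishing the rest immediately, which is what makes the approach attractive for sites that cannot afford to review every comment by hand.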
Jigsaw worked with The New York Times and Wikipedia to develop Perspective. The Times’s comments section is managed by 14 moderators, who manually review nearly every comment.
Because this requires considerable labor and time, The Times allows commenting on only about 10 percent of its articles. The Times said in a statement last year that it made its comments archive available to Jigsaw to help develop the machine-learning algorithm running Perspective. Linda Zebian, a spokeswoman for the Times, declined to comment on Wednesday.
Mr. Cohen said the technology was in its early stages and might flag some false positives, but he expected that it would become more accurate over time with access to a greater set of comments.
Jigsaw, whose stated mission is to use technology to tackle “geopolitical challenges” such as cybersecurity attacks and online censorship, said it also saw opportunities for its machine-learning software to identify comments that are off-topic or unsubstantial.
A version of this article appears in print on February 24, 2017, on Page B2 of the New York edition with the headline: Developing Online Tools to Flag Toxic Comments.