Authors furious as ‘wildly unethical’ AI website hoovers up books

By Nell Geraets

A website that used artificial intelligence to analyse tens of thousands of novels has been shut down after authors criticised it for using their work without permission or compensation.

Prosecraft, which was created by computer scientist Benji Smith in 2017, was intended as a database to help aspiring writers improve their work by comparing aspects of their prose to the work of authors they admire.

Holden Sheppard, author of Invisible Boys and The Brink, realised his work had been added to Prosecraft without consent.

More than 25,000 works of fiction by more than a thousand different authors were scraped using AI-generated linguistic algorithms. Many of the authors whose work featured on the site argue they did not consent to their intellectual property and copyrighted material being fed into the database.

Writers, like many other artists, are concerned their work is being used to train AI models without compensation, and that their prose could be used to create new material that is sold for profit without their involvement or approval.

Australian writer Holden Sheppard discovered on Monday that his novel The Brink was on Prosecraft, when someone alerted him to it on X (formerly Twitter).

“I didn’t even know what Prosecraft was,” Sheppard said. “I was absolutely furious. I spent eight and a half years of my life writing that book. It was one of the toughest things I’ve ever had to do ... So to suddenly see my work has been used in a way that I’m not OK with filled me with rage.”

Of particular concern to Sheppard is that material from Prosecraft was being used to train Smith’s writing software program, Shaxpir, which has not been shut down (meaning data from Prosecraft remains accessible).

“We don’t know where that [data] is going to be fed. It could be on-sold to ChatGPT or [Google’s] Bard or anyone else to generate made-up books using all my words. They could either profit off it, or they could ruin my reputation,” Sheppard said.

On Monday, author Jane Friedman (The Business of Being a Writer) discovered about half a dozen AI-generated books were being sold on Amazon under her name.

She said she was initially told the titles would not be removed because she didn’t own the AI books’ copyright, and her name was not trademarked. However, Amazon removed the books on Wednesday.

Devin Madson, the Australian author of We Ride the Storm, said data-scraping isn’t necessarily bad, but that a lack of transparency was exacerbating issues of bias already apparent within publishing.

“The publishing industry already has a problematic culture of requiring books and authors to look a certain way and tell certain stories. That’s a culture we have been trying to push back against for years,” they said.

As backlash intensified, Smith announced he would take Prosecraft down. On Tuesday, he apologised to those who felt their work had been unfairly used, but defended the site by stating it had never generated income and the risks of AI were not as apparent when it was launched.

“I researched copyright laws, mindful of not wanting to hurt or offend the community of authors that I cared so much about,” Smith wrote.

“Since I was only publishing summary statistics, and small snippets from the text of those books, I believed I was honouring the spirit of the Fair Use doctrine, which doesn’t require the consent of the original author.”

Regardless, it has sounded an alarm that should concern readers, said the Australian Society of Authors chief executive Olivia Lanchester.

“While Prosecraft copied books – which raises serious copyright infringement issues – it isn’t actually generative,” she said.

“The bigger concern is global AI companies such as Google, Microsoft and OpenAI who have used authors’ works to train their generative AI models, feeding millions of literary and artistic works into their machines to produce a product that may undermine the jobs of the people whose works they’ve exploited.

“Despite their work being essential to train and launch Generative AI tools such as ChatGPT, tech companies haven’t sought a licence or offered payment ... The first thing we need to do is stop accepting theft of creative work by big tech on the assumption that the benefits of AI justify any means.”

Melbourne queer fiction writer and poet Emma Osborne said the appropriation of their work by commercial AI entities without compensation is threatening the possibility of earning a sustainable income through creative writing – an industry whose members are already underpaid.

“It is wildly unethical,” Osborne said. “Prosecraft and similar sites are normalising theft.

“Aside from this, AI tools that spit out prose are a complete mockery of creativity – writers, like artists from other realms, practise their craft for years, if not decades. To see a site mash together words and spit out meaningless stats or straight up snippets of prose is heartbreaking.”

AI regulation remains lacking, Sheppard said, and government policy and industry codes of practice are required to ensure writers of the future can consider publishing as a viable career.

“We have hit the iceberg, the Titanic is sinking. Everyone is drowning,” Sheppard said. “So, we look to the government, and they’re like, ‘cool, let’s look at a map of icebergs’ – that’s not helpful. We need regulation yesterday.”
