
How Does AI Work, and How Is It Trained?
I think everyone at UB, along with millions of creators and readers, hates AI. It's an undisputed fact that companies and corporations feed data, legally and illegally, to their software and systems, which leads to copyright infringement. This also makes AI-generated content a source of debate: does the output violate copyrights, and can that output itself be copyrighted?
AI - HOW DOES IT WORK?
The giant (often illegally or at least unethically fed) hivemind of AI gains its knowledge through "training": corporations feed their systems text from books and other publications, as well as information sourced from websites and public datasets. This means content posted on places like Facebook, Substack, eBay, Amazon, Google, Goodreads, etc. can be collected by automated tools (you will often hear the word "scrape" used for this "collection").
Private companies may also enter their own data to improve services for their employees. Example: HubSpot helps salespeople read data so they can generate more sales.
BUT…a normal, everyday person does not have access to training APIs for AI.
Front-end user interfaces like ChatGPT or AI checkers are not open platforms for inputting training data, except for what they call "session-based" information.
Session-based information is stored during a specific session, then it is wiped from memory when the session is complete. This means a person could have a conversation with AI, asking for help writing a paper (or book, ahem) in one “session.” The next day, ChatGPT will not have any recollection of yesterday’s conversation if that person starts a new session.
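For the code-curious, here is a minimal sketch of what "session-based" means. It is a toy illustration only: the ChatSession class and its methods are made up for this post and are not how ChatGPT or any real product is built. The point is simply that the conversation lives inside one session object, and a new session starts empty.

```python
class ChatSession:
    """Toy stand-in for one conversation with a chat front end (hypothetical)."""

    def __init__(self):
        # Messages are stored only on this object, so only for this session.
        self.history = []

    def send(self, message):
        # Record the message; nothing outside this session is changed.
        self.history.append(message)
        return f"I have seen {len(self.history)} message(s) this session."

    def remembers(self, text):
        # True only if the text appeared earlier in this same session.
        return any(text in message for message in self.history)


# Day one: the session knows about the book while the chat is open.
monday = ChatSession()
monday.send("Help me outline 50 Shades of Bad AI.")
print(monday.remembers("50 Shades of Bad AI"))   # True

# Day two: a brand-new session has no record of yesterday's chat.
tuesday = ChatSession()
print(tuesday.remembers("50 Shades of Bad AI"))  # False
```

Nothing typed into the Monday session reaches the Tuesday session, and nothing in either session alters any shared, global "knowledge" in this sketch.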
Using ChatGPT, for example, I could not train AI on the newest book it is helping me write, called 50 Shades of Bad AI. Therefore, it would not (and could not) give D.L. information about my new book before it is published on a website that AI training tools scrape, like Amazon. After it's published? Well, AI might scrape it, unfortunately.
Companies like Meta have illegally or unethically uploaded copyrighted books, with excuses like "It will improve quality writing, style, expression, and long-form narration." But that text is not being uploaded through the front end of ChatGPT, which is all a common user has access to.
Recently, readers and authors have been quick to accuse the everyday person of "training AI" when they use a front-end service to check whether a book was written with AI or, as teachers do, to check whether a student's work is their own or AI-generated. While doing this could lead to a lot of false positives, it does not train the global AI technology, because that's just not its functionality.*
We believe that understanding the facts about AI and its technology will help us unite against the use of generative AI in the creative arts, to keep more dollars in the pockets of real authors and artists, and send a message to the mega corporations that we won't accept this widespread theft of our intellectual property and our hearts' work.
*At least in July of 2025, when this blog post was written. We acknowledge that technology changes fast, especially for those who have more money than they know what to do with... For now, we are just praying that AI doesn't advance so much that it can storm into our homes and steal the silverware.