The Education Forum

Beware AI



Folks,

Unless info posted here has a verifiable source that you can check, a book, a magazine, a document, preferably with a RIF number, don't trust it. 

I asked a question and Bob Ness ( if that is his real name, if that is a real person ) posted something he got from ChatGPT, an AI thing.  

We have enough fiction to deal with.  

Joe

Edited by Joseph Backes

16 minutes ago, Joseph Backes said:

Folks,

Unless info posted here has a verifiable source that you can check, a book, a magazine, a document, preferably with a RIF number, don't trust it. 

I asked a question and Bob Ness ( if that is his real name, if that is a real person ) posted something he got from ChatGPT, an AI thing.  

We have enough fiction to deal with.  

Joe

One thing I’d like to try is to feed a ton of JFK records into an embedding database like Pinecone and use OpenAI as a search engine. I’m not sure how well it would work, and for complex reasoning across disparate records you’d probably have to fine-tune it, which isn’t available on the best GPT models, but I think it’d be a worthwhile experiment.

Imagine if we had something as good as GPT4 that was fine-tuned on the entire JFK Collection though…I wonder how NARA’s coming along with the 2021 order from Biden to digitize the entire ARC…


Unless it tells you where it got its info from it's useless.  It has to specify here, here, and here.  And then you need to go there, there and there to verify that the info was real.  

People think it's a magic button that will do the reading and thinking for you. That AI will solve the case.

Nope.  

It's the next big stupid Ponzi scam.  Technology years ahead of the law, again.  Before it is regulated or banned billions will be made and lost. 


2 minutes ago, Joseph Backes said:

Unless it tells you where it got its info from it's useless.  It has to specify here, here, and here.  And then you need to go there, there and there to verify that the info was real.  

People think it's a magic button that will do the reading and thinking for you. That AI will solve the case.

Nope.  

It's the next big stupid Ponzi scam.  Technology years ahead of the law, again.  Before it is regulated or banned billions will be made and lost. 

With vector embeddings you could do that, have it return the sources of its info and associated metadata, which you could organize in your db by RIF number, etc. I’m sure you could incorporate RIFs and page numbers into a tuned/trained AI too but not sure how that would work exactly.

It would all depend on how you structured the input data. Ideally you’d have a dataset like the NARA database with an additional column containing all the text of the associated record. You could even extend it to page number, but either way it would be a massive pain in the ass to code if you had to scrub all the text from OCR-enabled PDFs, or something like that.

AI is certainly not going to solve the case, but it could be very useful as a semantic search-engine. For example, you could ask it something like: “Describe all the evidence supporting that a man ran out the back of the TSBD and jumped in a car after the assassination. Respond in table format with columns: witness, date, interviewing agent”, and you could configure the responses to include RIF and page numbers from where the data was retrieved. 

This sort of thing would save a hell of a lot of time compared, for example, to searching MFF for every record containing “rambler and depository”, “car and man and running and building” and similar queries.
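To make the idea above a bit more concrete, here is a rough sketch of a search that always hands back the RIF and page number alongside the matching text. Everything in it is an assumption for illustration: the `Page` layout, the `EXAMPLE-000x` identifiers (made up, not real RIFs), the sample sentences, and especially `embed`, which is just a word-count stand-in. A real setup would swap that function for an actual embedding model and a vector database, which is what would let it match on concepts rather than shared words.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Page:
    rif: str   # record identifier (the values below are invented, not real RIFs)
    page: int  # page number within the record
    text: str  # text of the page, assumed to be already extracted and legible


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, pages: list[Page], top_k: int = 2) -> list[tuple[float, Page]]:
    """Return the top_k pages by similarity, each with its RIF and page number attached."""
    q = embed(query)
    scored = [(cosine(q, embed(p.text)), p) for p in pages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Invented sample data, only to show the shape of the input.
PAGES = [
    Page("EXAMPLE-0001", 3, "Witness saw a man run from the rear of the building and enter a station wagon."),
    Page("EXAMPLE-0002", 1, "Inventory of office supplies received in October."),
    Page("EXAMPLE-0003", 7, "A man ran down the slope and got into a car driven by another man."),
]

results = search("man running from the building and getting into a car", PAGES)
for score, p in results:
    print(f"{p.rif} p.{p.page} score={score:.2f}: {p.text}")
```

The point of the design is that the source travels with every hit, so anything the search returns can be checked against the record itself. The hard part discussed in this thread, getting clean text out of the scans in the first place, is not addressed here at all.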


I'm sorry but I'm too skeptical.

We are dealing with documents that are severely degraded, extremely poor and often illegible photocopies, information made on every imaginable government form.  In addition to handwritten documents, antiquated computer formats; audio in many formats; film, video and photography in many formats and that's just the stuff in English.  

Asking a new computer system to search through all that as though it's all already in a legible standardized format is very hard to believe. 

Joe

 


1 hour ago, Joseph Backes said:

I'm sorry but I'm too skeptical.

We are dealing with documents that are severely degraded, extremely poor and often illegible photocopies, information made on every imaginable government form.  In addition to handwritten documents, antiquated computer formats; audio in many formats; film, video and photography in many formats and that's just the stuff in English.  

Asking a new computer system to search through all that as though it's all already in a legible standardized format is very hard to believe. 

Joe

 

I'm skeptical as well. Would a computer program be able to assess what is a credible source vs a non-credible source? I mean, there's all sorts of Oswald sightings no one takes seriously. Would a computer program try to make them all fit, where Oswald was in one state one day and another the next, like a traveling salesman? Or, what about the medical evidence? Many researchers make the mistake of relying upon what one witness said, as opposed to what all the witnesses said. Would a computer program know not to trust latter-day outlier statements made by octogenarians? Or would it give all statements equal weight no matter when they were made, and no matter how peripheral the witness? 

My point is that with any presentation of evidence, the sorting and presenting of the evidence can determine the viewer's response to the evidence. When I first created my website, google sorted by number of views. As a result, I could post something here and tell people to read more on my website, and a few days later my website would come up near the top of a google search of the topic. But then google--probably as a result of complaints from mucky-mucks--started weighing views based on whether or not "google" (not really google but someone working there who probably knows next to nothing about anything, along with a computer program) found a site credible. At that point, searches which used to lead one to this forum or my website got rerouted to The NY Times or egads! John McAdams' site, etc.

So...beware. If a program emerges in which you can access hundreds of JFK books and articles and then have summaries on certain subjects written by the program, dollars to donuts the program will have a human element--and it will be the same kind of human that is currently controlling google and wikipedia, etc. 

Edited by Pat Speer

1 hour ago, Joseph Backes said:

I'm sorry but I'm too skeptical.

We are dealing with documents that are severely degraded, extremely poor and often illegible photocopies, information made on every imaginable government form.  In addition to handwritten documents, antiquated computer formats; audio in many formats; film, video and photography in many formats and that's just the stuff in English.  

Asking a new computer system to search through all that as though it's all already in a legible standardized format is very hard to believe. 

Joe

 

That’s kind of what I meant by saying it would be a pain in the ass. It wouldn’t really work unless the text was already in a legible standardized format. 

I also think it’s unrealistic to rely on an AI for actually analyzing the evidence in any depth, kind of like Pat said. The value IMO would be as a semantic search engine, where you could search through a large number of documents and return information based on concepts instead of specific word matches. 


9 hours ago, Joseph Backes said:

I asked a question and Bob Ness ( if that is his real name, if that is a real person ) posted something he got from ChatGPT, an AI thing.  

Are you joking, Joe? Calling my identity into question? Maybe you should do your own research on who I am. If you can't figure that out, I do suggest you buy a clue somewhere. I try to do you a favor at my own expense and time, post a warning that the info is not confirmed, and you come back and insult me?

Edited by Bob Ness

5 hours ago, Pat Speer said:

As a result, I could post something here and tell people to read more on my website, and a few days later my website would come up near the top of a google search of the topic. But then google--probably as a result of complaints from mucky-mucks--started weighing views based on whether or not "google" (not really google but someone working there who probably knows next to nothing about anything, along with a computer program) found a site credible.

No. Google changes its algorithm constantly and has to parse through "Black Hat" and "White Hat" search engine optimization strategies to eliminate scammers and spammers gaming the system. What you were doing is called comment spamming and although yours was innocent enough, Google had to crack down on it like it did link farms and various other techniques to boost relevancy scores for organic searches. Unfortunately, many sites get harmed in the process but it's because SEO pros have figured out the weakness and Google responds when sites like yours are displacing other sites that should rank higher.

That's not to say your site isn't legitimate or features bad information, but Google's factors for search engine returns and indexing grade comment links lower than, say, direct links from an authoritative site like the DoJ or Harvard University (FYI backlinks from authoritative sites are probably the most valuable search factor and Google rates them very high). Years ago, SEOs could buy thousands of links for $10 from bogus but real domains and rank sites high due to the number of backlinks. When Penguin came out, thousands of sites dependent on internet income were delisted because they had those links and it was impossible to have them removed. Try emailing 10,000 bogus domains asking to remove the backlinks. Impossible to do it. Many went bankrupt.

Lecture over hahaha

Edited by Bob Ness

5 hours ago, Tom Gram said:

The value IMO would be as a semantic search engine, where you could search through a large number of documents and return information based on concepts instead of specific word matches. 

Exactly. It's all in the prompt. The search I did for Joe returned a believable and detailed response, but I wasn't able to confirm it - and I spent quite a bit of time on it. I still don't know whether it is accurate or not but it's well worth a quick look. Google anymore only returns ads.


3 hours ago, Bob Ness said:

No. Google changes its algorithm constantly and has to parse through "Black Hat" and "White Hat" search engine optimization strategies to eliminate scammers and spammers gaming the system. What you were doing is called comment spamming and although yours was innocent enough, Google had to crack down on it like it did link farms and various other techniques to boost relevancy scores for organic searches. Unfortunately, many sites get harmed in the process but it's because SEO pros have figured out the weakness and Google responds when sites like yours are displacing other sites that should rank higher.

That's not to say your site isn't legitimate or features bad information, but Google's factors for search engine returns and indexing grade comment links lower than, say, direct links from an authoritative site like the DoJ or Harvard University (FYI backlinks from authoritative sites are probably the most valuable search factor and Google rates them very high). Years ago, SEOs could buy thousands of links for $10 from bogus but real domains and rank sites high due to the number of backlinks. When Penguin came out, thousands of sites dependent on internet income were delisted because they had those links and it was impossible to have them removed. Try emailing 10,000 bogus domains asking to remove the backlinks. Impossible to do it. Many went bankrupt.

Lecture over hahaha

You have reminded me of some of what went down. I received numerous offers back in the day from people/companies offering to optimize my site for a price. I never did that as there was no need. On anything I cared about--say JFK's autopsy--my website routinely came up in the top ten searches. So I wasn't gaming the system or whatever by making comments here, and I was indeed the one victimized by the new policies as in fact there were no sites whatsoever that "should rank higher." Believe it or not, my website remains the definitive source on a number of topics, and google linking to a NY Times article from ten years ago in which something gets mentioned in passing instead of a detailed article on my website in which something is discussed in detail does no one favors, outside perhaps The NY Times.


I asked Google Bard for a bio of Otto Skorzeny and it was woefully and provably inaccurate. I’m not referring to the later details dug up for more recent books, just for basic stuff. 

Edited by Paul Brancato

9 hours ago, Pat Speer said:

You have reminded me of some of what went down. I received numerous offers back in the day from people/companies offering to optimize my site for a price. I never did that as there was no need. On anything I cared about--say JFK's autopsy--my website routinely came up in the top ten searches. So I wasn't gaming the system or whatever by making comments here, and I was indeed the one victimized by the new policies as in fact there were no sites whatsoever that "should rank higher." Believe it or not, my website remains the definitive source on a number of topics, and google linking to a NY Times article from ten years ago in which something gets mentioned in passing instead of a detailed article on my website in which something is discussed in detail does no one favors, outside perhaps The NY Times.

Your site has between 6,000 and 12,000 backlinks to it, many of which are suspicious or Chinese or whatever. You can check it for free at Ahrefs' backlink checker if you're interested. I don't think it has a site map either, and many other tags and whatnot are missing. Google and the other search engines use that stuff to determine what your site is about and which keyword phrases typed in by a user should return a page from your site. Your autopsy page I assume is Chapter 10: Examining the Examinations, which the bots assume is about college testing or some such thing because you haven't given it an appropriate h1 title tag, URL slugs or anchor text that includes relevant keywords, nor have you provided a description tag for the page.

The search engines use these tags to discourage "keyword stuffing" of content (and the old keyword tag) and rely heavily on relevant words and terms in these other tags to sort through the kajillions of pages they have to index for their users. The length and detail of your content is stellar, no doubt, but you're not presenting your ID to the search engines so they can call your name. In the old days the search engines would list sites for keywords like "Kennedy Assassination" because the content of the site had the term "Kennedy Assassination" on the page a thousand times out of two thousand words. Not so anymore. Keyword stuffing is as likely to get you penalized, or in some cases delisted, as having thousands of bogus links with no authority pointing to your domain. That includes links that may have piled up and been forgotten years ago, or that were purchased by the previous owner of the domain (before purchasing a domain you have to check for bogus backlinks).

That said, your site is indeed informative and detailed, and I enjoy it very much when I'm parsing through different topics. Your efforts are appreciated!
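If anyone wants to check a page for the tags mentioned above, here is a small sketch using only Python's standard library. It looks for a title, an h1 and a meta description and reports which are missing. It says nothing about how Google actually scores a page, and the `SAMPLE` markup is invented for illustration (only the chapter title comes from this thread); `TagAudit` and `missing_tags` are hypothetical names.

```python
from html.parser import HTMLParser


class TagAudit(HTMLParser):
    """Collects the <title>, the first <h1>, and the meta description of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.found = {"title": "", "h1": "", "description": ""}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1") and not self.found[tag]:
            self._current = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.found["description"] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.found[self._current] += data.strip()


def missing_tags(html: str) -> list[str]:
    """Return the names of the checked tags that are absent or empty."""
    audit = TagAudit()
    audit.feed(html)
    return [name for name, value in audit.found.items() if not value]


# Invented markup: a page with a title but no h1 and no description tag.
SAMPLE = """<html><head><title>Chapter 10: Examining the Examinations</title></head>
<body><p>Long, detailed text...</p></body></html>"""

print(missing_tags(SAMPLE))  # ['h1', 'description']
```

To run it against a real page, save the page source to a file and pass the file's contents to `missing_tags`.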


2 hours ago, Paul Brancato said:

I asked Google Bard for a bio of Otto Skorzeny and it was woefully and provably inaccurate. I’m not referring to the later details dug up for more recent books, just for basic stuff. 

It's definitely hit and miss and everything has to be checked. It's a useful tool for many things, but its citations always have to be checked and many times aren't available. I don't know why that is except maybe it has a database or even a repository it can access that is no longer available publicly. Old websites that have been updated for instance. Dunno.

