Download link: https://anonymfile.com/2pJ9n/8ball.zip
8ball is a Python script that depends on:
- tkinter: should be prepackaged with Python, otherwise see this SO thread.
- numpy: install with command: pip install numpy
- scikit-learn: install with command: pip install scikit-learn
Yes, it ends with space and a dot. Then press enter.
First load an alphabet file.
Then select a directory containing targets.
Then either create or load a model.
A newly created model must be trained; for a loaded model, training is optional.
If you are only loading a model and not training it, the targets do not need the training data.
A trained model can be saved for later use.
Then write your query in the box below the "5: Ask the model" button.
Then click that button.
Names of files with training data are irrelevant.
One file = one sample.
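To make "one file = one sample" concrete, here is a minimal sketch of how such a directory could be read.
The layout (one subdirectory per target, sample files inside it) and the name load_samples are my assumptions for illustration, not necessarily how 8ball does it.

```python
from pathlib import Path


def load_samples(targets_dir):
    """Return (texts, labels): one text per file, labelled by its target folder."""
    texts, labels = [], []
    # Assumed layout: targets_dir/<target name>/<any file name>
    for target in sorted(p for p in Path(targets_dir).iterdir() if p.is_dir()):
        for sample in sorted(p for p in target.iterdir() if p.is_file()):
            # The file name is never used, only the file's content and its folder.
            texts.append(sample.read_text(encoding="utf-8"))
            labels.append(target.name)
    return texts, labels
```

Because only the folder name ends up in the labels, the sample files can be called anything.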
--------------------------------------------------------------
IT'S NOT LIKE CHATGPT and that's a good thing.
8ball has more in common with this:
than with ChatGPT, and that has benefits:
No hallucinations.
It can only point to one of the links it is given.
This limits the damage it can cause: no glue on pizza.
The worst that can happen is that it suggests a JoS page that does not fit your query.
Cheap to run.
If you have used LLMs, you have noticed that the answer does not come in all at once.
That was really visible on the old Copilot site, where words and letters popped in sequentially.
That's the big model being run over and over again (as far as I understand). When you ask this kind of model, it is only run once.
Also, even in its current state the model works even on a really old laptop I tested it on.
And that old laptop barely felt it.
That means a decently sized model could be trained and run on a modern CPU.
This means no finicky GPU setups.
The program also strips out extra whitespace, unknown characters and punctuation from input text.
That makes the model insensitive to how the text is formatted.
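A minimal sketch of that kind of filtering, to show the idea.
The function name normalize, the lowercasing step and the example alphabet are assumptions on my part; the real script may do it differently.

```python
def normalize(text, alphabet):
    """Keep only characters from `alphabet` and collapse whitespace runs to one space."""
    allowed = set(alphabet)
    kept = []
    for ch in text.lower():  # lowercasing is an assumption, not confirmed for 8ball
        if ch.isspace():
            ch = " "  # treat tabs and newlines like plain spaces
        if ch in allowed:
            kept.append(ch)
    # split() with no arguments also drops leading and trailing whitespace
    return " ".join("".join(kept).split())


ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # hypothetical alphabet file contents
print(normalize("  Hello,\n\tWORLD!!  ", ALPHABET))  # hello world
```

Two queries that differ only in spacing, case or punctuation come out identical, which is why formatting stops mattering.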
To change the hidden layers and iterations, go to the newModel() function and edit this line:
mlp = MLPClassifier(hidden_layer_sizes=(100,100,100), max_iter=1000)
Note: if you want a single hidden layer, write it as a one-element tuple with a trailing comma, like this: (1234,)
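The trailing comma is plain Python syntax, nothing specific to scikit-learn.
A quick way to see the difference (variable names are just for illustration):

```python
one_layer = (1234,)             # tuple with one element: one hidden layer of 1234 neurons
not_a_tuple = (1234)            # just the integer 1234 wrapped in parentheses
three_layers = (100, 100, 100)  # the value used in the line above

print(type(one_layer).__name__, len(one_layer))        # tuple 1
print(type(not_a_tuple).__name__)                      # int
print(type(three_layers).__name__, len(three_layers))  # tuple 3

# Any of the tuples could then be passed as hidden_layer_sizes=... to MLPClassifier.
```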
I also recommend looking at the console output if something does not work.
The alphabet is not hardcoded due to high ambitions.
The filtering code could be thrown out and a file without extra characters could be passed.
As neural nets only understand numbers, passing a different alphabet and/or target list than the one the model was trained with will give erroneous results.
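A toy illustration of why that happens.
The per-character counting below is an assumed encoding picked for brevity, and it also assumes the model answers with a position in the target list; 8ball's actual encoding may differ, but the problem is the same for any scheme that maps characters and targets to positions.

```python
def encode(text, alphabet):
    """Count how often each alphabet character occurs in `text` (assumed encoding)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    vector = [0] * len(alphabet)
    for ch in text:
        if ch in index:
            vector[index[ch]] += 1
    return vector


# Same text, same characters, different alphabet order: different numbers.
print(encode("abba", "abc"))  # [2, 2, 0]
print(encode("abba", "cba"))  # [0, 2, 2]

# Same story for targets when the model's answer is a position, not a name.
targets_when_trained = ["page_a", "page_b"]  # hypothetical target names
targets_now = ["page_b", "page_a"]
predicted = 0
print(targets_when_trained[predicted], targets_now[predicted])  # page_a page_b
```

The model has no way to notice the swap, it just gets numbers that mean something else.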
The biggest future improvement is fuzzing training samples like in that video.
That would allow just dumping, say, a sermon into one file and getting lots of samples out of it.
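One simple way to get many samples out of one long text is cutting it into overlapping word windows.
This is a stand-in to show the idea, not necessarily the fuzzing technique from the video, and window_samples with its size and step parameters is a hypothetical helper.

```python
def window_samples(text, size=8, step=4):
    """Split `text` into overlapping chunks of `size` words, moving `step` words at a time."""
    words = text.split()
    if len(words) <= size:
        # Text too short for windowing: return it as a single sample (or nothing).
        return [" ".join(words)] if words else []
    samples = []
    # Trailing words that do not fill a whole window are dropped.
    for start in range(0, len(words) - size + 1, step):
        samples.append(" ".join(words[start:start + size]))
    return samples


print(window_samples("one two three four five six", size=3, step=1))
# ['one two three', 'two three four', 'three four five', 'four five six']
```

Each returned string would then be treated as its own sample for the same target.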
Also maybe pack everything into an SQLite database so the model and its data would be a single file.
But that will (even if just marginally) make it harder to add more targets and training data.
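A rough sketch of what the single-file idea could look like with Python's built-in sqlite3 and pickle modules.
The table names, column names and the save_bundle/load_bundle helpers are all made up for illustration, and the model here is any picklable object standing in for a trained classifier.

```python
import pickle
import sqlite3


def save_bundle(path, model, samples):
    """Store a pickled model and (target, content) samples in one SQLite file."""
    con = sqlite3.connect(path)
    with con:  # commits if the block succeeds, rolls back on error
        con.execute("CREATE TABLE IF NOT EXISTS model (id INTEGER PRIMARY KEY, data BLOB)")
        con.execute("CREATE TABLE IF NOT EXISTS samples (target TEXT, content TEXT)")
        # Only one model per file: always overwrite row 1.
        con.execute("INSERT OR REPLACE INTO model (id, data) VALUES (1, ?)", (pickle.dumps(model),))
        con.execute("DELETE FROM samples")
        con.executemany("INSERT INTO samples (target, content) VALUES (?, ?)", samples)
    con.close()


def load_bundle(path):
    """Return (model, samples) from a file written by save_bundle."""
    con = sqlite3.connect(path)
    (data,) = con.execute("SELECT data FROM model WHERE id = 1").fetchone()
    samples = con.execute("SELECT target, content FROM samples ORDER BY rowid").fetchall()
    con.close()
    # pickle.loads runs code from the file, so only open bundles you trust.
    return pickle.loads(data), samples
```

It also shows the downside mentioned above: adding a sample becomes an INSERT instead of dropping a file into a folder.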
Otherwise it's about making the GUI presentable.
-----------------------------------------------------------------------------------------------------
If this screams "my first tkinter/neural net project" then yes, guilty on both counts.
This should have been a project for like 2 weekends given I did most of it this weekend.
As that took way longer due to laziness, a more polished version will be released maybe sometime.
It is rough because I wanted to finally get something, literally anything, working and out of the door.
But even if I don't end up following through, maybe the idea presented will at least stop someone from chasing a chatgpt-flavored pie in the sky.
Also, I called it "8ball" instead of "oracle" or anything as pompous to lower expectations.
Machine learning isn't magic, especially with relatively tiny amounts of good data (website, sermons, questions) available.