Perchance tools journal

2024-06-09

Write documentation explaining how to create Perchance code from a thesaurus.

2024-06-07

Set up type checking as part of the default tox tasks.
Add tests where appropriate (digital-notes and perchance-tools).
Review the code and refactor it where appropriate.

2024-06-05

Set up and push a version of digital-notes.

2024-06-04

Add cache capabilities to llm-assistant.
- It uses langchain's built-in SQLiteCache (langchain_community.cache.SQLiteCache)
- Add new parameters to the configuration file: use_cache, cache_file
Create a test for the custom method of the llm-assistant API.
There is a separate cache library: cache3
- I might use this for word-def
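
A minimal sketch of what the use_cache and cache_file parameters could control. It relies only on the standard sqlite3 module rather than on the langchain cache, so it illustrates the idea and is not the actual llm-assistant code. PromptCache, cached_call, and the table layout are hypothetical names introduced for this example.

```python
import hashlib
import sqlite3
from typing import Callable, Optional


class PromptCache:
    """Tiny prompt -> response cache stored in a SQLite file (hypothetical helper)."""

    def __init__(self, cache_file: str) -> None:
        # cache_file mirrors the configuration parameter of the same name.
        self.conn = sqlite3.connect(cache_file)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, response TEXT)"
        )

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together so different models do not share entries.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT response FROM cache WHERE key = ?", (self._key(model, prompt),)
        ).fetchone()
        return row[0] if row else None

    def put(self, model: str, prompt: str, response: str) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, response) VALUES (?, ?)",
            (self._key(model, prompt), response),
        )
        self.conn.commit()


def cached_call(
    llm: Callable[[str], str],
    model: str,
    prompt: str,
    cache: Optional[PromptCache] = None,
) -> str:
    """Call llm(prompt); pass cache=None to model use_cache being switched off."""
    if cache is None:
        return llm(prompt)
    hit = cache.get(model, prompt)
    if hit is not None:
        return hit
    response = llm(prompt)
    cache.put(model, prompt, response)
    return response
```

With a cache attached, a repeated prompt is answered from the SQLite file and the model is not called again.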

2024-05-30

Simplify the WordDict data structure.

The markdown-to-yml command creates nested lists of single-key dicts:

root = [
    {
        "Personnages":
        [
            {
                "Nouns":
                [
                    {
                        "words": []
                    }
                ]
            },
            {
                "Adjectives":
                [
                    ...
                ]
            }
        ]
    },
    {
        "Places":
        [
            ...
        ]
    }
]

The lists are not necessary. Instead, we can simply use a dictionary of dictionaries:

root = {
    "Personnages":
    {
        "Nouns":
        {
            "words": []
        },
        "Adjectives":
        {
            ...
        }
    },
    "Places":
    {
        ...
    }
}
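
The conversion between the two shapes can be sketched as below. This assumes the target is a dictionary at every level, including the root, and that word lists are the only lists that should survive. The function name simplify is hypothetical.

```python
def simplify(node):
    """Collapse lists of single-key dicts into plain dicts; keep word lists as-is."""
    if isinstance(node, list) and node and all(isinstance(i, dict) for i in node):
        merged = {}
        for item in node:
            for key, value in item.items():
                # Two entries with the same name would silently overwrite each other.
                if key in merged:
                    raise ValueError(f"duplicate key: {key!r}")
                merged[key] = simplify(value)
        return merged
    if isinstance(node, dict):
        return {key: simplify(value) for key, value in node.items()}
    # Lists of plain strings (the words) and scalars are returned unchanged.
    return node
```

Categories and sub-categories become keys that can be looked up directly, for example tree["Personnages"]["Nouns"]["words"].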

2024-05-05

Set up a cache mechanism to save the GPT-4 responses. I should actually implement that in llm-assistant.

I could not continue because I don't have the full list of words on this computer and I can't access giove.

2024-04-22

Simplified the yaml format. The category names are keys of the underlying dictionary.

2024-04-17

Created the perchance-tools repository. It currently has one working command only: markdown-to-yml.

The correct-words command is partially working. I already have the code that traverses the yml file and generates a prompt from it. The next step is to connect with llm-assistant and generate a yml file with the corrections applied.
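
A rough sketch of the traversal part, not the actual correct-words code. It assumes the yml file has already been loaded into nested dictionaries with a "words" list at each leaf, as in the simplified structure from the 2024-05-30 entry. The name iter_word_lists is hypothetical.

```python
from typing import Iterator, List, Tuple


def iter_word_lists(
    tree: dict, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], List[str]]]:
    """Yield (category_path, words) for every "words" list found in the tree."""
    for key, value in tree.items():
        if key == "words" and isinstance(value, list):
            # The path collected so far names the category of this word list.
            yield path, value
        elif isinstance(value, dict):
            yield from iter_word_lists(value, path + (key,))
```

Each yielded pair carries enough information to build one prompt per category.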

After that, the next step is to code translate-attributes.

2024-04-10

Decided to first create a tool in which I can create custom prompts to interact with OpenAI. The idea is to use this tool to make the corrections in the list of words.

2024-03-29

Error during creation of all.yml

I noticed that the all.yml file stopped at chronology. That also explains the unexpectedly low number of lines (~4K). I need to regenerate everything. This is a good time to stop and think of a better approach for the text correction.

I corrected the creation of all.yml file.

Corrections made by ChatGPT

I used the original prompt of word-guru. This prompt was designed to correct complete texts, not isolated words. It does not work very well: when given a word that is spelled correctly but rarely used, the LLM tends to replace it with a similar, more popular word.

Examples:

donzelle -> demoiselle
jouvenceau -> jeunesse
petiot -> petit
se manier -> se manifester
souquenille -> sacoin

It would be better to pass a context, for example by saying that the given list of words belongs to a category (vêtements, adjectifs pour décrire la jeunesse...).

There is also a second advantage to doing that: the current prompt sends a lot of context for just a single word. We may get better results and reduce OpenAI consumption by passing the complete list.
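
A possible shape for such a prompt, as a sketch only. The wording of the instructions, the function name build_correction_prompt, and the language parameter are assumptions; the real prompt would need tuning against the model.

```python
from typing import List


def build_correction_prompt(
    category: str, words: List[str], language: str = "French"
) -> str:
    """Build one prompt for a whole category instead of one prompt per word."""
    lines = [
        f"The following {language} words all belong to the category: {category}.",
        "Some are rare or dated but correctly spelled; keep those unchanged.",
        "Fix only genuine spelling mistakes.",
        "Answer with one line per word, in the form: original -> corrected.",
        "",
    ]
    # The shared instructions are sent once, followed by the complete word list.
    lines.extend(f"- {word}" for word in words)
    return "\n".join(lines)
```

Stating the category gives the model a reason to accept rare words such as souquenille, and the instructions are paid for once per list rather than once per word.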

2024-03-23

Pipeline

Digitize the thesaurus from a specialized book.
Format the digitized text using markdown headers.
Convert the markdown to yml.
Translate the yml keys to English. Correct any typos in the words (in the target language).
Convert to perchance format (also sort alphabetically).
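
The last step can be sketched as follows. Perchance lists are written as a list name on its own line followed by items indented with two spaces. Everything else here is an assumption: the nested-dict input with "words" leaves, the way category names are joined into a list name, and the function names perchance_blocks and to_perchance.

```python
def perchance_blocks(tree, path=()):
    """Return one Perchance list block per "words" list in the nested dict."""
    blocks = []
    for key, value in tree.items():
        if key == "words" and isinstance(value, list):
            # Assumed naming scheme: category path joined with underscores.
            name = "_".join(part.lower() for part in path)
            # Case-insensitive sort by code point; accented letters are not
            # ordered by French collation rules.
            items = sorted(value, key=str.casefold)
            blocks.append("\n".join([name] + ["  " + item for item in items]))
        elif isinstance(value, dict):
            blocks.extend(perchance_blocks(value, path + (key,)))
    return blocks


def to_perchance(tree):
    """Join all blocks with blank lines into one Perchance text."""
    return "\n\n".join(perchance_blocks(tree)) + "\n"
```

Category names containing spaces or punctuation would need extra cleaning before they can serve as list names.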