diff --git a/db.py b/db.py
index b0e8ecc..1d50c1e 100644
--- a/db.py
+++ b/db.py
@@ -35,8 +35,6 @@ class Database:
     which can then be used to obtain record.
 
     TODO For faster nearest neighbour lookup we should use something else, e.g. kd-trees
-    TODO The resulting db is huge (50 MB for 4 pdfs), contains a lot of duplicit uncompressed text.
-    We should at least de-duplicate the document path.
     """
     vectors: list[Vector]
     records: dict[VectorBytes, Record]