[idea] using Parseq, using the idea of Scene Text recognition

To-do for myself:
get a dataset of the new captchas
https://github.com/baudm/parseq
use CRNN because of its low CPU usage, so it can be attached to an API
use an API with a rate limit of 5 requests per minute.
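
For the rate-limit item, here is a minimal sketch of a sliding-window limiter in plain Python. The class name, the per-client keying and the injectable clock are all assumptions made for illustration; nothing in this thread says how the API would actually be built or which web framework it would use.

```python
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` accepted calls per `window_seconds` per client.

    Hypothetical helper: the name and interface are illustrative only.
    """

    def __init__(self, max_requests=5, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = {}  # client id -> deque of timestamps of accepted requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller should reject the request
        hits.append(now)
        return True
```

An API handler would call `limiter.allow(client_ip)` before running the model and return an error (for example HTTP 429) when it gets `False`.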

I don't really know anything about AI and all that stuff, but one idea I had was combining this with the previous dataset to generate a dataset of the new captchas. Idk, maybe that's a dumb idea.

The current model uses a similar CRNN architecture, but even smaller; you don't need much for captchas of 20 characters with zero semantics. You can check the training notebook I shared in #6 for more details.
You can run this model as an API, moffatman does that for chance iirc
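
For readers unfamiliar with CRNNs: they are commonly trained with CTC, where the network emits one class score vector per image column and a decoder collapses repeats and blanks into the final string. Whether the model in this thread is decoded this way is an assumption (the notebook in #6 is the place to check); the sketch below only illustrates greedy CTC decoding, with a hypothetical function name and a made-up alphabet in the test.

```python
BLANK = 0  # assumption: class 0 is the CTC blank, classes 1..N map to the alphabet


def ctc_greedy_decode(frame_scores, alphabet):
    """Collapse per-frame class scores into a string with greedy CTC decoding.

    frame_scores: list of per-time-step score lists, one score per class.
    alphabet: string whose i-th character corresponds to class i + 1.
    """
    decoded = []
    previous = BLANK
    for scores in frame_scores:
        # Pick the highest-scoring class for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a character only when it is not blank and not a repeat of the previous frame.
        if best != BLANK and best != previous:
            decoded.append(alphabet[best - 1])
        previous = best
    return "".join(decoded)
```

A blank frame between two identical characters is what lets the decoder output a genuine double letter instead of merging them.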

@drunohazarb

Time for @drunohazarb to update the script then.

Why did the development on the script stop?
Do you have the new model?
Any news from moffatman?
How come nobody cares anymore?

I'm not the script developer
What's wrong with current one?
Ask him?
Dunno, current script works fine for me

It doesn't work when the white letters are on the black blob; it doesn't solve those letters/numbers.

@yukariin

@yukariin how many are needed to get a 75% success rate?

1k should be fine for fine-tuning.
You also want an even distribution between different captcha types. For example, if you have 1k new captchas (white letter in black circle), you want 1k old captchas (without circle) and 1k new-old captchas (black letter in black circle with white outline), so a 3k dataset in total.
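
The even split described above can be sketched as follows. This is only an illustration: the function name, the `(image_path, label)` sample format and the type names are assumptions, not something taken from the training notebook.

```python
import random


def build_balanced_dataset(samples_by_type, per_type=1000, seed=0):
    """Draw the same number of samples from every captcha type, then shuffle.

    samples_by_type: dict mapping a type name (e.g. "old", "new", "new-old")
    to a list of (image_path, label) pairs. Hypothetical layout.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    mixed = []
    for captcha_type, samples in samples_by_type.items():
        if len(samples) < per_type:
            raise ValueError(
                f"{captcha_type}: need {per_type} samples, have {len(samples)}"
            )
        # Sample without replacement and keep the type alongside each pair.
        mixed.extend(
            (captcha_type, *sample) for sample in rng.sample(samples, per_type)
        )
    rng.shuffle(mixed)
    return mixed
```

With three types and `per_type=1000`, this yields the 3k mixed set; keeping the type on each row also makes it easy to check later whether one type lags behind.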

The older captchas are the same.

Yeah, but you don't want the model to regress on old captchas, which will happen if you train/tune only on new ones.

Here's what I've got

Here's a test model that achieves 98.9% (7655/7735) accuracy on a combined large dataset (10k old + 16k new + 1k new (white letters))
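
To see whether a model like this regresses on one captcha type, accuracy can be broken down per type as well as overall. The sketch below assumes accuracy means exact match on the whole captcha string, which the thread does not state; the function name and the input format are hypothetical.

```python
from collections import defaultdict


def accuracy_by_type(results):
    """Exact-match accuracy per captcha type plus an overall figure.

    results: iterable of (captcha_type, predicted_text, true_text) triples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for captcha_type, predicted, truth in results:
        total[captcha_type] += 1
        # A captcha only counts as solved if every character matches.
        correct[captcha_type] += int(predicted == truth)
    if not total:
        raise ValueError("no results to score")
    report = {name: correct[name] / total[name] for name in total}
    report["overall"] = sum(correct.values()) / sum(total.values())
    return report
```

For reference, 7655 correct out of 7735 gives 7655 / 7735 ≈ 0.9897 overall.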

It's not that I don't care, there's just not much I can do without collecting captcha samples myself.
Anyways, I just updated the script to use Yukariin's new test model. I will push the update as soon as I can confirm that the model can solve the older captchas as well, for good measure.

As for the suggestion, I think it belongs here: https://github.com/based-org/chana-solver

[idea] using Parseq, using the idea of Scene Text recognition #13

Description

Activity

beansofhell commented on Jan 7, 2024

slabodan commented on Jan 7, 2024

yukariin commented on Jan 10, 2024

JonseyJones commented on Jan 11, 2024

JonseyJones commented on Jan 11, 2024

yukariin commented on Jan 11, 2024

JonseyJones commented on Jan 11, 2024

yukariin commented on Jan 11, 2024

JonseyJones commented on Jan 11, 2024

yukariin commented on Jan 12, 2024

yukariin commented on Jan 12, 2024

drunohazarb commented on Jan 12, 2024
