Niv Sardi committed commit 1732876 (1 parent: bbf5506)
README: update TODO
README.org changed (+13 -5)
@@ -4,14 +4,19 @@ Detect spoofed website by detecting logos from bank and financial entities in
|
|
4 |
pages with =ssl certificates= that do not match.
|
5 |
|
6 |
The process is pretty simple:
|
7 |
-
- [
|
|
|
|
|
8 |
- [x] get logos, names and url
|
9 |
- [x] navigate the url, extract the ssl certificate and look for =img= and tags
|
10 |
with =id= or =class= logo (needs more heuristics) to make a db of logos
|
11 |
- [x] screenshot the page and slice it into tiles generating YOLO annotations for
|
12 |
the detected logos
|
13 |
- [x] augment data using the logos database and the logoless tiles as background images
|
14 |
-
- [
|
|
|
|
|
|
|
15 |
- [ ] feed everything to a web extension that will detect the logos in any page
|
16 |
and show a warning if the =SSL certificate= mismatches the collected one.
|
17 |
|
@@ -20,13 +25,13 @@ The process is pretty simple:
|
|
20 |
# build the training dataset
|
21 |
docker-compose up --build --remove-orphans -d
|
22 |
docker-compose exec python ./run
|
23 |
-
|
24 |
# run the training on your machine or on Colab
|
25 |
# https://colab.research.google.com/drive/10R7uwVJJ1R1k6oTjbkkhxPDka7COK-WE
|
26 |
git clone https://github.com/ultralytics/yolov5 # clone repo
|
27 |
pip install -U -r yolov5/requirements.txt # install dependencies
|
28 |
python3 yolov5/train.py --img 416 --batch 80 --epochs 100 --data ./ia/data.yaml --cfg ./ia/yolov5s.yaml --weights ''
|
29 |
-
|
30 |
#+end_src
|
31 |
|
32 |
* research
|
@@ -38,7 +43,7 @@ https://github.com/Hyuto/yolov5-tfjs
|
|
38 |
there were a lot of augmentation solutions out there, because it had better
|
39 |
pipelines and multicore support we went with:
|
40 |
- https://github.com/aleju/imgaug
|
41 |
-
|
42 |
but leaving the others here for reference
|
43 |
- https://github.com/srp-31/Data-Augmentation-for-Object-Detection-YOLO-
|
44 |
- https://github.com/mdbloice/Augmentor
|
@@ -53,3 +58,6 @@ http://www.bcra.gob.ar/SistemasFinancierosYdePagos/Entidades_financieras.asp
|
|
53 |
https://stackoverflow.com/questions/6566545/is-there-any-way-to-access-certificate-information-from-a-chrome-extension
|
54 |
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest#accessing_security_information
|
55 |
https://chromium-review.googlesource.com/c/chromium/src/+/644858
|
|
|
|
|
|
|
|
4 |
pages with =ssl certificates= that do not match.
|
5 |
|
6 |
The process is pretty simple:
|
7 |
+
- [1/2] scrape government websites to get a list of entities.
|
8 |
+
- [x] 🇦🇷 BCRA ok
|
9 |
+
- [ ] other countries
|
10 |
- [x] get logos, names and url
|
11 |
- [x] navigate the url, extract the ssl certificate and look for =img= and tags
|
12 |
with =id= or =class= logo (needs more heuristics) to make a db of logos
|
13 |
- [x] screenshot the page and slice it into tiles generating YOLO annotations for
|
14 |
the detected logos
|
15 |
- [x] augment data using the logos database and the logoless tiles as background images
|
16 |
+
- [2/3] train YOLO
|
17 |
+
- [x] v5
|
18 |
+
- [x] v6
|
19 |
+
- [ ] v7 (actually slower than v6)
|
20 |
- [ ] feed everything to a web extension that will detect the logos in any page
|
21 |
and show a warning if the =SSL certificate= mismatches the collected one.
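
The tiling step in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `Box`, `tile_origins` and `yolo_lines_for_tile` are hypothetical, the 416 px tile size is only assumed from the `--img 416` training flag, and logos are assumed to be already located as pixel boxes on the full screenshot. The one fixed point is the YOLO label format, `class x_center y_center width height`, with all values normalized to the tile size.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in page pixel coordinates (hypothetical helper type)."""
    x0: float  # left
    y0: float  # top
    x1: float  # right
    y1: float  # bottom


def tile_origins(width: int, height: int, tile: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left corner of every full, non-overlapping tile."""
    for ty in range(0, height - tile + 1, tile):
        for tx in range(0, width - tile + 1, tile):
            yield tx, ty


def yolo_lines_for_tile(
    logos: List[Tuple[int, Box]], tx: int, ty: int, tile: int
) -> List[str]:
    """Return YOLO label lines for the logos that overlap the tile at (tx, ty).

    An empty result means the tile is logoless and could be reused as a
    background image for augmentation.
    """
    lines = []
    for class_id, box in logos:
        # Clip the logo box to the tile boundaries.
        x0, y0 = max(box.x0, tx), max(box.y0, ty)
        x1, y1 = min(box.x1, tx + tile), min(box.y1, ty + tile)
        if x1 <= x0 or y1 <= y0:
            continue  # no overlap with this tile
        # Normalize center and size to the 0..1 range relative to the tile.
        cx = ((x0 + x1) / 2 - tx) / tile
        cy = ((y0 + y1) / 2 - ty) / tile
        w = (x1 - x0) / tile
        h = (y1 - y0) / tile
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines
```

With an 832x416 screenshot and a logo at pixels (104, 104) to (312, 312), the first tile gets one label line centered at 0.5, 0.5 and the second tile gets none, so it would count as a logoless background tile.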
|
22 |
|
|
|
25 |
# build the training dataset
|
26 |
docker-compose up --build --remove-orphans -d
|
27 |
docker-compose exec python ./run
|
28 |
+
|
29 |
# run the training on your machine or on Colab
|
30 |
# https://colab.research.google.com/drive/10R7uwVJJ1R1k6oTjbkkhxPDka7COK-WE
|
31 |
git clone https://github.com/ultralytics/yolov5 # clone repo
|
32 |
pip install -U -r yolov5/requirements.txt # install dependencies
|
33 |
python3 yolov5/train.py --img 416 --batch 80 --epochs 100 --data ./ia/data.yaml --cfg ./ia/yolov5s.yaml --weights ''
|
34 |
+
|
35 |
#+end_src
|
36 |
|
37 |
* research
|
|
|
43 |
there were a lot of augmentation solutions out there, because it had better
|
44 |
pipelines and multicore support we went with:
|
45 |
- https://github.com/aleju/imgaug
|
46 |
+
|
47 |
but leaving the others here for reference
|
48 |
- https://github.com/srp-31/Data-Augmentation-for-Object-Detection-YOLO-
|
49 |
- https://github.com/mdbloice/Augmentor
|
|
|
58 |
https://stackoverflow.com/questions/6566545/is-there-any-way-to-access-certificate-information-from-a-chrome-extension
|
59 |
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest#accessing_security_information
|
60 |
https://chromium-review.googlesource.com/c/chromium/src/+/644858
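
The links above concern reading certificate information from inside a browser extension. For the collection side, a minimal Python sketch of the "collected vs. observed certificate" comparison could look like the following. It is an illustration under assumptions, not the project's code: the function names are hypothetical, and comparing a SHA-256 fingerprint of the whole DER-encoded certificate is just one possible matching rule.

```python
import hashlib
import socket
import ssl


def fingerprint(der_bytes: bytes) -> str:
    """Return the hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def fetch_certificate_der(host: str, port: int = 443, timeout: float = 10.0) -> bytes:
    """Open a TLS connection and return the server certificate in DER form.

    Needs network access; verification uses the system's default trust store.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def certificate_mismatch(collected_fingerprint: str, observed_der: bytes) -> bool:
    """True if the observed certificate differs from the collected fingerprint."""
    return fingerprint(observed_der) != collected_fingerprint.lower()
```

One design caveat: a full-certificate fingerprint changes every time the site renews its certificate, so a real check would probably need to refresh the collected data or compare something more stable, such as the subject or the public key.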
|
61 |
+
|
62 |
+
** papers
|
63 |
+
https://logomotive.sidnlabs.nl/downloads/LogoMotive_paper.pdf
|