Niv Sardi committed on
Commit 1732876 · 1 Parent(s): bbf5506

README: update TODO

Files changed (1)
  1. README.org +13 -5
README.org CHANGED
@@ -4,14 +4,19 @@ Detect spoofed website by detecting logos from bank and financial entities in
  pages with =ssl certificates= that do not match.

  The process is pretty simple:
- - [x] scrape gvt websites to get a list of entities (for Argentina it's BCRA)
+ - [1/2] scrape gvt websites to get a list of entities.
+ - [x] 🇦🇷 BCRA ok
+ - [ ] other countries
  - [x] get logos, names and url
  - [x] navigate the url, extract the ssl certificate and look for =img= and tags
  with =id= or =class= logo (needs more heuristics) to make a db of logos
  - [x] screenshot the page and slice it into tiles generating YOLO annotations for
  the detected logos
  - [x] augment data using the logos database and the logoless tiles as background images
- - [x] train yolov5s
+ - [2/3] train YOLO
+ - [x] v5
+ - [x] v6
+ - [ ] v7 (actually slower than v6)
  - [ ] feed everything to a web extension that will detect the logos in any page
  and show a warning if the =SSL certificate= mismatches the collected one.

@@ -20,13 +25,13 @@ The process is pretty simple:
  # build the training dataset
  docker-compose up --build --remove-orphans -d
  docker-compose exec python ./run
-
+
  # run the training on your machine or colab
  # https://colab.research.google.com/drive/10R7uwVJJ1R1k6oTjbkkhxPDka7COK-WE
  git clone https://github.com/ultralytics/yolov5 # clone repo
  pip install -U -r yolov5/requirements.txt # install dependencies
  python3 yolov5/train.py --img 416 --batch 80 --epochs 100 --data ./ia/data.yaml --cfg ./ia/yolov5s.yaml --weights ''
-
+
  #+end_src

  * research
@@ -38,7 +43,7 @@ https://github.com/Hyuto/yolov5-tfjs
  there were a lot of augmentation solutions out there, because it had better
  pipelines and multicore support we went with:
  - https://github.com/aleju/imgaug
-
+
  but leaving the other here for refs
  - https://github.com/srp-31/Data-Augmentation-for-Object-Detection-YOLO-
  - https://github.com/mdbloice/Augmentor
@@ -53,3 +58,6 @@ http://www.bcra.gob.ar/SistemasFinancierosYdePagos/Entidades_financieras.asp
  https://stackoverflow.com/questions/6566545/is-there-any-way-to-access-certificate-information-from-a-chrome-extension
  https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest#accessing_security_information
  https://chromium-review.googlesource.com/c/chromium/src/+/644858
+
+ ** papers
+ https://logomotive.sidnlabs.nl/downloads/LogoMotive_paper.pdf
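
For readers of the README step "screenshot the page and slice it into tiles generating YOLO annotations for the detected logos", here is a minimal sketch of how that mapping could work. It is not the repository's actual code: the `Box` type, the `tile_annotations` function, the `min_visible` threshold and the 416 px tile size (borrowed from the `--img 416` training flag) are all assumptions made for illustration. It only computes the label lines in the usual YOLO text format (`class x_center y_center width height`, normalized to the tile) and leaves image cropping out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical logo box in page pixel coordinates (x0, y0 top-left; x1, y1 bottom-right)."""
    cls: int
    x0: float
    y0: float
    x1: float
    y1: float


def tile_annotations(page_w, page_h, boxes, tile=416, min_visible=0.5):
    """Map page-level logo boxes to per-tile YOLO label lines.

    Returns {(col, row): ["cls cx cy w h", ...]}. Tiles with no logo map to an
    empty list, so they can double as logoless backgrounds for augmentation.
    A box is kept in a tile only if at least `min_visible` of its area falls
    inside that tile (an assumed heuristic, not taken from the project).
    """
    out = {}
    for row, ty in enumerate(range(0, page_h, tile)):
        for col, tx in enumerate(range(0, page_w, tile)):
            # Edge tiles may be smaller than `tile`.
            tw = min(tile, page_w - tx)
            th = min(tile, page_h - ty)
            lines = []
            for b in boxes:
                area = (b.x1 - b.x0) * (b.y1 - b.y0)
                if area <= 0:
                    continue  # skip degenerate boxes
                # Clip the box to the current tile.
                ix0, iy0 = max(b.x0, tx), max(b.y0, ty)
                ix1, iy1 = min(b.x1, tx + tw), min(b.y1, ty + th)
                if ix1 <= ix0 or iy1 <= iy0:
                    continue  # no overlap with this tile
                if (ix1 - ix0) * (iy1 - iy0) / area < min_visible:
                    continue  # too little of the logo is visible here
                # Normalize the clipped box to tile-relative coordinates.
                cx = ((ix0 + ix1) / 2 - tx) / tw
                cy = ((iy0 + iy1) / 2 - ty) / th
                w = (ix1 - ix0) / tw
                h = (iy1 - iy0) / th
                lines.append(f"{b.cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
            out[(col, row)] = lines
    return out
```

Under these assumptions, a 832x416 screenshot with one logo at (104, 104)-(312, 312) yields a single label line for tile `(0, 0)` and an empty list for tile `(1, 0)`; each list would be written to the `.txt` file that sits next to the corresponding tile image.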