#+TITLE: Spoof Detect
Detect spoofed websites by finding logos of banks and financial entities in
pages whose =SSL certificates= do not match the ones collected for those entities.
The process is pretty simple:
- [1/2] scrape government websites to get a list of entities.
  - [x] 🇦🇷 BCRA ok
  - [ ] other countries
- [x] get logos, names and URLs
- [x] navigate to each URL, extract the SSL certificate and look for =img= tags and
  tags with =id= or =class= logo (needs more heuristics) to build a db of logos
- [x] screenshot the page and slice it into tiles, generating YOLO annotations for
  the detected logos
- [x] augment the data using the logos database, with the logoless tiles as background images
- [2/3] train YOLO
  - [x] v5
  - [x] v6
  - [ ] v7 (actually slower than v6)
- [ ] feed everything to a web extension that will detect the logos in any page
  and show a warning if the =SSL certificate= does not match the collected one.
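
The tiling step can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the =Box= type, the function names and the choice to drop partial edge tiles are all assumptions, and the tile size of 416 is only borrowed from the =--img 416= flag used for training. It shows how a logo box in page pixels becomes a YOLO annotation line (=class cx cy w h=, normalised to the tile) for each tile it overlaps.

#+begin_src python
from dataclasses import dataclass
from typing import Iterator, Optional

TILE = 416  # assumed tile size, borrowed from --img 416 in the training command


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in page pixels (hypothetical helper type)."""
    x0: int
    y0: int
    x1: int
    y1: int


def tiles(width: int, height: int, size: int = TILE) -> Iterator[Box]:
    """Yield non-overlapping full tiles; partial tiles at the edges are dropped."""
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield Box(left, top, left + size, top + size)


def yolo_line(logo: Box, tile: Box, class_id: int = 0) -> Optional[str]:
    """Return the YOLO line for the part of `logo` inside `tile`, or None."""
    # Clip the logo box to the tile.
    x0, y0 = max(logo.x0, tile.x0), max(logo.y0, tile.y0)
    x1, y1 = min(logo.x1, tile.x1), min(logo.y1, tile.y1)
    if x0 >= x1 or y0 >= y1:
        return None  # no overlap: this is a logoless tile for that logo
    w, h = tile.x1 - tile.x0, tile.y1 - tile.y0
    # YOLO wants the box centre and size relative to the tile dimensions.
    cx = ((x0 + x1) / 2 - tile.x0) / w
    cy = ((y0 + y1) / 2 - tile.y0) / h
    return f"{class_id} {cx:.6f} {cy:.6f} {(x1 - x0) / w:.6f} {(y1 - y0) / h:.6f}"


if __name__ == "__main__":
    logo = Box(400, 100, 500, 150)  # made-up logo position on an 832x416 screenshot
    for tile in tiles(832, 416):
        print(tile, yolo_line(logo, tile))
#+end_src

Tiles for which every logo yields =None= are the logoless tiles that can be reused as backgrounds during augmentation.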
* running
#+begin_src sh
# build the training dataset
docker-compose up --build --remove-orphans -d
docker-compose exec python ./run
# run the training on your machine or on Colab
# https://colab.research.google.com/drive/10R7uwVJJ1R1k6oTjbkkhxPDka7COK-WE
git clone https://github.com/ultralytics/yolov5 # clone repo
pip install -U -r yolov5/requirements.txt # install dependencies
python3 yolov5/train.py --img 416 --batch 80 --epochs 100 --data ./ia/data.yaml --cfg ./ia/yolov5s.yaml --weights ''
#+end_src
* research
** yolo
https://github.com/ModelDepot/tfjs-yolo-tiny
https://github.com/Hyuto/yolov5-tfjs
** augmentation
There are a lot of augmentation solutions out there. We went with this one
because it has better pipelines and multicore support:
- https://github.com/aleju/imgaug
The others are kept here for reference:
- https://github.com/srp-31/Data-Augmentation-for-Object-Detection-YOLO-
- https://github.com/mdbloice/Augmentor
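
Independently of the library used, pasting a logo onto a logoless tile needs a position and a matching YOLO label. The sketch below covers only that bookkeeping in plain Python. The function name =place_logo= and its parameters are hypothetical, and the actual pixel work (pasting, scaling, blurring and so on) would be left to imgaug or an imaging library.

#+begin_src python
import random
from typing import Tuple


def place_logo(tile_size: int, logo_w: int, logo_h: int,
               rng: random.Random, class_id: int = 0) -> Tuple[int, int, str]:
    """Pick a random spot where the logo fits entirely inside a square tile.

    Returns the top-left paste position and the YOLO label for that placement.
    """
    if logo_w > tile_size or logo_h > tile_size:
        raise ValueError("logo does not fit in the tile")
    left = rng.randint(0, tile_size - logo_w)  # randint is inclusive on both ends
    top = rng.randint(0, tile_size - logo_h)
    # YOLO label: class, box centre and box size, all relative to the tile.
    cx = (left + logo_w / 2) / tile_size
    cy = (top + logo_h / 2) / tile_size
    label = (f"{class_id} {cx:.6f} {cy:.6f} "
             f"{logo_w / tile_size:.6f} {logo_h / tile_size:.6f}")
    return left, top, label


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the generated dataset is reproducible
    print(place_logo(416, 120, 40, rng))
#+end_src

Passing a seeded =random.Random= keeps the generated dataset reproducible between runs.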
** providers
http://www.bcra.gov.ar/SistemasFinancierosYdePagos/Proveedores-servicios-de-pago-ofrecen-cuentas-de-pago.asp
http://www.bcra.gov.ar/SistemasFinancierosYdePagos/Proveedores-servicios-de-billeteras-digitales-Interoperables.asp
http://www.bcra.gob.ar/SistemasFinancierosYdePagos/Entidades_financieras.asp
** certs in browsers
https://stackoverflow.com/questions/6566545/is-there-any-way-to-access-certificate-information-from-a-chrome-extension
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest#accessing_security_information
https://chromium-review.googlesource.com/c/chromium/src/+/644858
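
The links above cover reading the certificate from inside the browser. For the collection side, a minimal sketch using only the Python standard library could look like this. The function names are hypothetical and comparing SHA-256 fingerprints is an assumption, not necessarily how this project stores or compares certificates.

#+begin_src python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_der_certificate(host: str, port: int = 443, timeout: float = 10.0) -> bytes:
    """Open a TLS connection and return the server's leaf certificate in DER form."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def certificate_mismatch(collected_fingerprint: str, der_cert: bytes) -> bool:
    """True when the certificate seen now differs from the collected one."""
    return fingerprint(der_cert) != collected_fingerprint.lower()


if __name__ == "__main__":
    # Needs network access; the host is only an example taken from the links above.
    der = fetch_der_certificate("www.bcra.gob.ar")
    print(fingerprint(der))
#+end_src

An exact fingerprint changes every time a site renews its certificate, so a real check would probably compare something more stable, such as the subject or the public key.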
** papers
https://logomotive.sidnlabs.nl/downloads/LogoMotive_paper.pdf