Difference between revisions of "Private:progress-alkurbi"
From NMSL
Revision as of 11:44, 14 February 2011
Spring 2011
- Courses: None
- Research: Developing Online SIP-Botnet Detection System
- Progress Report: Please read "Progress" section here
Feb 15
- All figures in the report are in EPS format.
- I implemented and tested "Tracking IP-flows" instead of SIP sessions (to avoid needing the application-layer content, which could be encrypted). I ran a number of experiments, which showed unstable precision (FP/FN ranging from low and mild to high and very high), so I discarded it.
- To reduce the cost of calculating the similarity between two users from $O(N^2)$ to $O(N)$, I redesigned the correlation & detection engine to insert sessions in order of their length, then implemented three similarity/distance algorithms: cosine similarity, normalized Euclidean distance, and Canberra distance. Sessions are inserted in order so that we can eliminate the session time factor. In the experiments, I could not find a threshold value that minimizes both FP and FN.
- I have almost finished implementing an incremental processing mode to improve performance: instead of reprocessing all the data within a time window at every slide, I try to reuse what was built for the previous time window and integrate only the users/sessions that are new in the current sliding window. The next step is to test the code, and I am sure it will need quite a few refinements.
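
The three measures named in the Feb 15 notes can be sketched as below. This is a minimal illustration and not the engine's actual code. It assumes each user is represented by an equal-length vector of session lengths sorted in ascending order, and it assumes "normalized Euclidean distance" means the Euclidean distance between the unit-length versions of the two vectors (other definitions exist). All function names are hypothetical.

```python
import math

def _check_same_length(a, b):
    # Assumption for this sketch: both users contribute vectors of equal size.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means identical direction."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def normalized_euclidean(a, b):
    """Euclidean distance between unit-length versions of a and b (assumed definition)."""
    _check_same_length(a, b)
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return math.sqrt(sum((x / norm_a - y / norm_b) ** 2 for x, y in zip(a, b)))

def canberra_distance(a, b):
    """Sum of |x - y| / (|x| + |y|), skipping positions where both values are zero."""
    _check_same_length(a, b)
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0.0:
            total += abs(x - y) / denom
    return total

# Hypothetical session lengths (seconds) for two users, sorted by length.
user_a = sorted([12.0, 30.0, 45.0])
user_b = sorted([45.0, 13.0, 29.0])
scores = (
    cosine_similarity(user_a, user_b),
    normalized_euclidean(user_a, user_b),
    canberra_distance(user_a, user_b),
)
```

Each measure is a single pass over the two aligned vectors, so comparing one pair of users is linear in the number of sessions once the sessions are already held in sorted order.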
Feb 08
- Rewrite the experiments and evaluation results in a formal manner under the "Experimental Evaluation" chapter. (Done)
- Implementing & testing identification of SIP-Botnet controllers. (Done)
- Implementing and testing Online Mode. (Done)
Feb 01
- Evaluation according to the plan (Large Scale Evaluation & Documentation) is complete, as follows:
  - Generated traffic has been checked.
  - Alpha & Beta have been tuned.
  - Separate traffic has been generated for different numbers of bots [10, 50, 100].
  - FP/FN have been calculated for different Win [1h, 2h, 3h] and different Sliding-Win [5m, 10m, 15m, 20m, 25m, 30m], with different numbers of bots.
  - Average running time has been computed for different Win [1h, 2h, 3h] and different numbers of bots [10, 50, 100].
  - A total of 34 figures have been plotted and included in the report.
- The attached report has all the updates.
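
For reference, the FP/FN figures for a single window can be computed along the following lines. This is only a sketch: it assumes a false positive is a benign user flagged as a bot and a false negative is a bot that was not flagged, and it reports both counts and rates because the notes do not say which one the report uses. The function name and the user IDs are hypothetical.

```python
def fp_fn_stats(flagged_users, true_bots, all_users):
    """Compare the detector's output for one window against ground truth."""
    flagged = set(flagged_users)
    bots = set(true_bots)
    benign = set(all_users) - bots

    false_positives = flagged & benign   # benign users flagged as bots
    false_negatives = bots - flagged     # bots the detector missed

    return {
        "fp": len(false_positives),
        "fn": len(false_negatives),
        # Rates are relative to the benign and bot populations respectively.
        "fp_rate": len(false_positives) / len(benign) if benign else 0.0,
        "fn_rate": len(false_negatives) / len(bots) if bots else 0.0,
    }

# Hypothetical window: 10 users, 2 of them bots, detector flags u0 and u5.
users = [f"u{i}" for i in range(10)]
stats = fp_fn_stats(flagged_users=["u0", "u5"], true_bots=["u0", "u1"], all_users=users)
```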
Jan 24
- Work (Large Scale Evaluation & Documentation):
  - Generated 24h of SIP traffic with 1000 users and 10 bots.
  - Tuned Alpha & Beta values.
  - Ran the proposed system against the generated traffic with different Win sizes (3h, 2h) and different Sliding-Win sizes (5m, 10m, 15m, 20m, 25m, 30m) to calculate false positives/negatives, and generated 12 statistics reports.
  - Exporting the statistics reports into MATLAB and generating figures.
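
The Win / Sliding-Win grid used in these runs can be enumerated as in the sketch below. It assumes that only windows lying fully inside the 24h trace are evaluated, which the notes do not state, and the function name is hypothetical. The two Win sizes and six Sliding-Win sizes give 12 configurations, consistent with the 12 statistics reports.

```python
def sliding_windows(total_minutes, win_minutes, slide_minutes):
    """Yield (start, end) minute offsets of each window that fits in the trace."""
    if win_minutes <= 0 or slide_minutes <= 0:
        raise ValueError("window and slide sizes must be positive")
    start = 0
    while start + win_minutes <= total_minutes:
        yield (start, start + win_minutes)
        start += slide_minutes

TRACE_MINUTES = 24 * 60                  # 24h of generated SIP traffic
WIN_SIZES = (180, 120)                   # 3h, 2h
SLIDE_SIZES = (5, 10, 15, 20, 25, 30)    # Sliding-Win sizes in minutes

# One entry per (Win, Sliding-Win) configuration: the number of windows evaluated.
window_counts = {
    (win, slide): sum(1 for _ in sliding_windows(TRACE_MINUTES, win, slide))
    for win in WIN_SIZES
    for slide in SLIDE_SIZES
}
```

Under this assumption, a 3h window sliding by 5m yields 253 windows over the trace, so the smallest Sliding-Win sizes dominate the total running time when each window is processed from scratch.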