This article explores optimizing prefix sum (scan) operations with ARM NEON SIMD instructions. It shows how to process several integer values in parallel using vector operations and interleaved loads and stores, reaching throughput in the tens of gigabytes per second, faster than a scalar loop.
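To make the idea concrete, here is a minimal sketch in portable C, not the article's code. It contrasts a scalar prefix sum with a blocked version that scans four values at a time using two shift-and-add steps and then adds a running carry. The function names, the `LANES` constant, and the choice of `uint32_t` are assumptions made for illustration. The inner loops over lanes stand in for single vector operations; on a real NEON target the lane shift might map to an intrinsic such as `vextq_u32` and the add to `vaddq_u32`. The interleaved load/store technique mentioned in the summary is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: four 32-bit lanes, matching one 128-bit NEON register. */
#define LANES 4

/* Scalar baseline: out[i] = in[0] + ... + in[i] (inclusive scan). */
void prefix_sum_scalar(const uint32_t *in, uint32_t *out, size_t n) {
    assert(n == 0 || (in != NULL && out != NULL));
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += in[i];
        out[i] = sum;
    }
}

/* Inclusive scan of one block: shift by 1 and add, then shift by 2 and add.
 * Each inner loop emulates what a single vector instruction would do. */
static void scan_block(uint32_t v[LANES]) {
    for (size_t shift = 1; shift < LANES; shift *= 2) {
        uint32_t shifted[LANES];
        for (size_t i = 0; i < LANES; i++) {
            shifted[i] = (i >= shift) ? v[i - shift] : 0; /* zero fills from the left */
        }
        for (size_t i = 0; i < LANES; i++) {
            v[i] += shifted[i];
        }
    }
}

/* Blocked prefix sum: scan each block locally, then add the carry
 * (the last output of the previous block) to every lane. */
void prefix_sum_blocked(const uint32_t *in, uint32_t *out, size_t n) {
    assert(n == 0 || (in != NULL && out != NULL));
    uint32_t carry = 0;
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        uint32_t v[LANES];
        for (size_t k = 0; k < LANES; k++) v[k] = in[i + k];          /* vector load  */
        scan_block(v);
        for (size_t k = 0; k < LANES; k++) out[i + k] = v[k] + carry; /* add + store  */
        carry = out[i + LANES - 1];
    }
    for (; i < n; i++) { /* scalar tail for the leftover elements */
        carry += in[i];
        out[i] = carry;
    }
}
```

The blocked version does more additions per element than the scalar loop, but each step works on a whole register, so the serial dependency is only the carry passed from one block to the next.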
performance-optimization
simd
arm-neon
algorithm-optimization
prefix-sum
vectorization
cpu-optimization
Daniel Lemire
ARM NEON