-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor from 0.7.1 to 0.10.0 causing unrecoverable crash #36
Comments
@purew I believe the perf changes you made have this side effect. Can you take a look? |
The |
Hm, I don't quite understand how that could help here, can you elaborate? IMHO a check comparing |
Sure. fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
let mut at_index;
let mut shift = 0;
let mut result = 0;
let mut byte;
loop {
at_index = chars[index];
byte = (at_index as u64).saturating_sub(63);
index += 1;
let to_add = (byte & 0x1f)
.checked_shl(shift)
.ok_or(format!("Couldn't decode character at index {}", index - 1))?;
result |= to_add;
shift += 5;
if byte < 0x20 {
break;
}
}
let coordinate_change = if (result & 1) > 0 {
!(result >> 1)
} else {
result >> 1
} as i64;
Ok((coordinate_change, index))
} However, the performance impact of this change is huge, so it's not feasible: Finished bench [optimized] target(s) in 0.12s
Running benches/benchmarks.rs (target/release/deps/benchmarks-797a7b4ddb4d36ce)
bench encode: 1000 coordinates
time: [77.693 µs 78.226 µs 78.783 µs]
change: [-0.0171% +0.2637% +0.6474%] (p = 0.13 > 0.05)
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) high mild
9 (9.00%) high severe
bench decode: 21502 coordinates
time: [11.087 ms 11.100 ms 11.114 ms]
change: [+5942.8% +5967.2% +5988.8%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
bench polyline6 decoding
time: [1.4322 µs 1.4341 µs 1.4361 µs]
change: [+2978.0% +2987.4% +2996.4%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
8 (8.00%) high mild
5 (5.00%) high severe
bench HUGE polyline6 decoding
time: [1.2993 ms 1.3056 ms 1.3128 ms]
change: [+5460.0% +5504.3% +5549.9%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
Please post some code demonstrating your idea so we can test and benchmark it! |
Oh wow, that's quite the impact. Okay, I think I understand a bit better now. I'm not totally up to speed has to how the polyline encoding/decoding works, I will probably do some research on it. I was thinking something like this - it's basically the current code but the I don't have rust environment set up currently, so I wrote this out of thin air: fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
let mut at_index;
let mut shift = 0;
let mut result = 0;
let mut byte;
let mut valid = False;
while index < chars.len() {
at_index = chars[index];
byte = (at_index as u64).saturating_sub(63);
index += 1;
result |= (byte & 0x1f) << shift;
shift += 5;
if byte < 0x20 {
valid = True;
break;
}
}
if not valid {
// this would mean that we have terminated above loop
// but not because of the original condition on "byte < 0x20",
// so this should be treated as an error
Err( ? )
}
let coordinate_change = if (result & 1) > 0 {
!(result >> 1)
} else {
result >> 1
} as i64;
Ok((coordinate_change, index))
} Edit: slightly refactored. |
That doesn't catch the error, unfortunately: fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
let mut at_index;
let mut shift = 0;
let mut result = 0;
let mut byte;
let mut valid = false;
while index < chars.len() {
at_index = chars[index];
byte = (at_index as u64).saturating_sub(63);
index += 1;
result |= (byte & 0x1f) << shift;
shift += 5;
if byte < 0x20 {
valid = true;
break;
}
}
if !valid {
// this would mean that we have terminated above loop
// but not because of the original condition on "byte < 0x20",
// so this should be treated as an error
return Err("Couldn't shift".to_string());
}
let coordinate_change = if (result & 1) > 0 {
!(result >> 1)
} else {
result >> 1
} as i64;
Ok((coordinate_change, index))
} The problem is this line: |
While I'm aware that this does not fix the actual error per-se, it should at least handle the binary crashing violently, right? I understand that it would be desirable to also know the exact character that is invalid though. Currently, the main thing thats bugging me is that certain input to the This is not much different than the "solution" I proposed earlier, but might further clarify the situation: fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
let mut at_index;
let mut shift = 0;
let mut result = 0;
let mut byte;
loop {
if index >= chars.len() {
Err("Cannot decode polyline".to_string())
}
at_index = chars[index];
byte = (at_index as u64).saturating_sub(63);
index += 1;
result |= (byte & 0x1f) << shift;
shift += 5;
if byte < 0x20 {
break;
}
}
let coordinate_change = if (result & 1) > 0 {
!(result >> 1)
} else {
result >> 1
} as i64;
Ok((coordinate_change, index))
} |
Both of your solutions work with release optimisations, and have negligible perf impact. But they don't work with test optimisations (running There's no need to worry about the |
Ah, okay! So if I read you correctly, cargo detects a possible overflow on |
What about something like this? fn trans(chars: &[u8], mut index: usize) -> Result<(i64, usize), String> {
let mut at_index;
let mut shift = 0;
let mut result = 0;
let mut byte;
loop {
at_index = chars[index];
byte = (at_index as u64).saturating_sub(63);
index += 1;
// Protect against overflow by checking if any bits would be moved out of the u64
if ((byte & 0x1f) >> (64 - shift)) != 0 {
Err(format!("Couldn't decode character at index {}", index - 1))
}
result |= (byte & 0x1f) << shift;
shift += 5;
if byte < 0x20 {
break;
}
}
let coordinate_change = if (result & 1) > 0 {
!(result >> 1)
} else {
result >> 1
} as i64;
Ok((coordinate_change, index))
} Edit: I haven't fully read up on the logic of polyline decoding, but probably we just need to check if loop {
at_index = chars[index];
byte = (at_index as u64).saturating_sub(63);
result |= (byte & 0x1f) << shift;
if byte < 0x20 {
break;
}
index += 1;
shift += 5;
if shift > (64 - 5) {
Err(format!("Couldn't decode character at index {}", index - 1))
}
} I have prepared a PR: #37 |
Hi!
First of, thanks a lot for your work on this crate!
I am experiencing a bug when using
georust/polyline
through a python wrapper calledpypolyline
.If that's okay, I will link to the issue that I filed in their repository which contains the necessary information.
To repeat and summarize some of the information I gave on that other issue:
pypolyline
may crash unrecoverablyAny help very much appreciated.
Let me know if you need more information.
Best,
Stefan
The text was updated successfully, but these errors were encountered: