Intro to Rust (with speaker notes)

dbrgn
September 29, 2015

This is the same presentation as here https://speakerdeck.com/dbrgn/intro-to-rust, but with speaker notes included.


Transcript

  1. 2 Agenda 1. What is Rust? 2. What’s Type Safety?

    3. Reading Rust 4. Memory Safety in Rust 5. Multithreaded Programming 6. Further Reading 7. Questions
  2. 4 «Rust is a systems programming language that runs blazingly

    fast, prevents nearly all segfaults, and guarantees thread safety.» www.rust-lang.org
  3. 5 What’s wrong with systems languages? - It’s difficult to

    write secure code. - It’s very difficult to write multithreaded code. These are the problems Rust was made to address. 1 - Systems languages have come a long way in the last 50 years - The first virus based on a buffer overflow appeared in 1988 - According to the Open Source Vulnerability Database, buffer overflows still account for 10-15% of reported vulnerabilities over the last 8 years 2 - Multithreading is becoming more necessary with multicore CPUs - C++-style threading is incredibly hard; even experienced programmers write hard-to-reproduce bugs
  4. 6 Quick Facts about Rust (As of September 2015) -

    Started by Mozilla employee Graydon Hoare - First announced by Mozilla in 2010 - Community driven development - First stable release: 1.0 in May 2015 - Latest stable release: 1.3 - 46'484 commits on Github - Largest project written in Rust: Servo
  5. 7 Features - Zero-cost abstractions - Move semantics - Guaranteed

    memory safety - Threads without data races - Trait based generics - Pattern matching - Type inference - Minimal runtime, no GC - Efficient C bindings
  6. 9 A C Program int main(int argc, char **argv) {

    unsigned long a[1]; a[3] = 0x7ffff7b36cebUL; return 0; } According to C99, undefined behavior. Output: undef: Error: .netrc file is readable by others. undef: Remove password or make file unreadable by others. 1 - We’re overwriting the return address on the stack. - Jumping right into libc 2 - The user is responsible for safety - We’re not good at that
  7. 10 Definitions - If a program has been written so

    that no possible execution can exhibit undefined behavior, we say that program is well defined. - If a language’s type system ensures that every program is well defined, we say that language is type safe
  8. 11 Type Safe Languages - C and C++ are not

    type safe. - Python is type safe: >>> a = [0] >>> a[3] = 0x7ffff7b36ceb Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: list assignment index out of range >>> - Java, JavaScript, Ruby, and Haskell are also type safe. 1 - Our sample program had no type errors, yet exhibits undefined behavior. 2 - An exception is not undefined behavior. 3 - Every program that type safe languages accept is well defined.
  9. 12 It’s ironic. C and C++ are not type safe.

    Yet they are being used to implement the foundations of a system. Rust tries to resolve that tension. - Rust also allows unsafe code. But the great majority of programs do not need unsafe code.
  10. 14 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Program that calculates the greatest common divisor
  11. 15 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } The fn keyword introduces a function definition. Arrow denotes return value.
  12. 16 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } Variables are immutable by default
  13. 17 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Rust variable declarations: Name followed by type - [uif](8|16|32|64) and usize / isize
  14. 18 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Macro invocations with exclamation mark
  15. 19 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Type is inferred from context or suffix. Otherwise i32.
  16. 20 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - let introduces a local variable - Type is inferred
  17. 21 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Loops and conditions don’t need parentheses, only braces around body
  18. 22 Example 1 fn gcd(mut n: u64, mut m: u64)

    -> u64 { assert!(n != 0 && m != 0); while m != 0 { if m < n { let t = m; m = n; n = t; } m = m % n; } n } - Blocks are expressions - return statement optional
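The function from the Example 1 slides, laid out as a complete program. The `main` function and the sample inputs are additions for illustration and are not part of the slides.

```rust
// Greatest common divisor of two non-zero integers (Euclid's algorithm),
// with the same body as on the slides.
fn gcd(mut n: u64, mut m: u64) -> u64 {
    assert!(n != 0 && m != 0);
    while m != 0 {
        if m < n {
            // Swap so that m >= n before taking the remainder.
            let t = m;
            m = n;
            n = t;
        }
        m = m % n;
    }
    n // the last expression of the block is the return value
}

fn main() {
    // Sample inputs chosen for illustration.
    println!("gcd(12, 18) = {}", gcd(12, 18));
    println!("gcd(14, 15) = {}", gcd(14, 15));
}
```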
  19. 23 Example 2: Generics fn min<T: Ord>(a: T, b: T)

    -> T { if a <= b { a } else { b } } - Program that returns the smaller of two values
  20. 24 Example 2: Generics fn min<T: Ord>(a: T, b: T)

    -> T { if a <= b { a } else { b } } - The <T: Ord> text marks it as a generic function - Defined for any type T where T is Ord - If a type is Ord, it supports a comparison - Ord is a Trait, will be handled later
  21. 25 Example 2: Generics fn min<T: Ord>(a: T, b: T)

    -> T { if a <= b { a } else { b } } ... min(10i8, 20) == 10; // T is i8 min(10, 20u32) == 10; // T is u32 min(“abc”, “xyz”) == “abc”; // Strings are Ord min(10i32, “xyz”); // error: mismatched types.
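A runnable sketch of the generic `min` from Example 2. The `main` function is an addition for illustration; the calls mirror the ones on the slide.

```rust
// Returns the smaller of two values of any type that implements Ord.
fn min<T: Ord>(a: T, b: T) -> T {
    if a <= b { a } else { b }
}

fn main() {
    // The literal without a suffix takes its type from the other argument.
    assert_eq!(min(10i8, 20), 10); // T is i8
    assert_eq!(min(10, 20u32), 10); // T is u32
    assert_eq!(min("abc", "xyz"), "abc"); // string slices are Ord

    // The next line would be rejected at compile time (mismatched types):
    // min(10i32, "xyz");

    println!("all comparisons behaved as expected");
}
```

Floating point types such as `f64` implement only `PartialOrd`, not `Ord`, so `min(1.0, 2.0)` would not compile with this signature.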
  22. 26 Example 3: Generic Types struct Range<Idx> { start: Idx,

    end: Idx, } ... Range { start: 200, end: 800 } // OK Range { start: 1.3, end: 4.7 } // Also OK - A struct holds data - Can be generic
  23. 27 Example 4: Enumerations enum Option<T> { Some(T), None }

    - Algebraic Datatypes - For any type T, an Option<T> may be either: - None, which carries no value - Some(v) which carries the value v of type T - Resemble unions in C / C++, but remember their value type
  24. 28 Example 5: Application of Option<T> fn safe_div(n: i32, d:

    i32) -> Option<i32> { if d == 0 { return None; } Some(n / d) } - Function that returns a division result, or None if divisor is 0 - (Could also be written as a single expression)
  25. 29 Example 6: Matching an Option match safe_div(num, denom) {

    None => println!(“No quotient.”), Some(v) => println!(“Quotient is {}.”, v) } - Similar to a switch statement, but more powerful - Rust offers combinator methods for simplification, more on this later
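Examples 5 and 6 combined into one program, as a minimal sketch. The helper `describe` and the sample values are hypothetical; it returns the message as a `String` instead of printing it so that the result is easy to check.

```rust
// Returns the quotient, or None if the divisor is zero (single-expression form).
fn safe_div(n: i32, d: i32) -> Option<i32> {
    if d == 0 { None } else { Some(n / d) }
}

// Hypothetical helper: turns the Option into a message via pattern matching.
fn describe(num: i32, denom: i32) -> String {
    match safe_div(num, denom) {
        None => "No quotient.".to_string(),
        Some(v) => format!("Quotient is {}.", v),
    }
}

fn main() {
    println!("{}", describe(7, 2));
    println!("{}", describe(7, 0));
}
```

The compiler checks that the `match` covers every variant, so forgetting the `None` arm is a compile-time error rather than a crash at run time.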
  26. 30 Example 7: Traits trait HasArea { fn area(&self) ->

    f64; } - Any type implementing HasArea must provide a method called “area” that takes no parameters and returns an f64
  27. 31 Example 8: Trait Implementation struct Circle { x: f64,

    y: f64, radius: f64, } impl HasArea for Circle { fn area(&self) -> f64 { consts::PI * (self.radius * self.radius) } } - Traits are implemented for structs - Other way of looking at it: Methods (behavior) are attached to data
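A self-contained version of Examples 7 and 8. The `Square` type and the `total_area` function are hypothetical additions that show why a trait is useful: one generic function works for every type that implements `HasArea`.

```rust
use std::f64::consts::PI;

trait HasArea {
    fn area(&self) -> f64;
}

#[allow(dead_code)] // x and y are kept from the slide but not used here
struct Circle {
    x: f64,
    y: f64,
    radius: f64,
}

impl HasArea for Circle {
    fn area(&self) -> f64 {
        PI * (self.radius * self.radius)
    }
}

// Hypothetical second shape, not on the slides.
struct Square {
    side: f64,
}

impl HasArea for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Hypothetical helper: sums the areas of a slice of any HasArea type.
fn total_area<T: HasArea>(shapes: &[T]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let c = Circle { x: 0.0, y: 0.0, radius: 1.0 };
    println!("circle area: {}", c.area());
    let squares = [Square { side: 1.0 }, Square { side: 2.0 }];
    println!("total square area: {}", total_area(&squares));
}
```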
  28. 32 Example 9: Default Methods trait Validatable { fn is_valid(&self)

    -> bool; fn is_invalid(&self) -> bool { !self.is_valid() } } - A trait can provide default implementations - Can be overridden
  29. 33 Example 10: Trait Composition trait Foo { fn foo(&self);

    } trait FooBar : Foo { fn foobar(&self); } - Any type implementing FooBar must also implement Foo - Example: A trait Number that requires implementing Add, Sub, Mul, Div
  30. 35 Three Key Promises - No null pointer dereferences -

    No dangling pointers - No buffer overruns 1 - Your program will not crash because you tried to dereference a null pointer 2 - Every value will live as long as it must 3 - Your program will never access elements outside of an array All ensured at compile time. What do they mean?
  31. 36 P1: No null pointer dereferences - Null pointers are

    useful - They can indicate the absence of optional information - They can indicate failures - But they can introduce severe bugs - Rust separates the concept of a pointer from the concept of an optional or error value - Optional values are handled by Option<T> - Error values are handled by Result<T, E> - Many helpful tools to do error handling
  32. 37 You already saw Option<T> fn safe_div(n: i32, d: i32)

    -> Option<i32> { if d == 0 { return None; } Some(n / d) } - But what if you want to return an error, not just None?
  33. 38 There’s also Result<T, E> enum Result<T, E> { Ok(T),

    Err(E) } - E can be any type, even String
  34. 39 How to use Results enum Error { DivisionByZero, }

    fn safe_div(n: i32, d: i32) -> Result<i32, Error> { if d == 0 { return Err(Error::DivisionByZero); } Ok(n / d) } - Good practice to define your own error types instead of using strings
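The slide's `safe_div` with a custom error type, as a complete program. The `derive` line and the `main` function are additions for illustration; `Debug` lets the error be printed and `PartialEq` lets it be compared.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    DivisionByZero,
}

fn safe_div(n: i32, d: i32) -> Result<i32, Error> {
    if d == 0 {
        return Err(Error::DivisionByZero);
    }
    Ok(n / d)
}

fn main() {
    // Sample inputs chosen for illustration.
    for &(n, d) in [(10, 2), (10, 0)].iter() {
        match safe_div(n, d) {
            Ok(v) => println!("{} / {} = {}", n, d, v),
            Err(e) => println!("{} / {} failed: {:?}", n, d, e),
        }
    }
}
```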
  35. 40 Tedious Results fn do_calc() -> Result<i32, String> { let

    a = match do_subcalc1() { Ok(val) => val, Err(msg) => return Err(msg), }; let b = match do_subcalc2() { Ok(val) => val, Err(msg) => return Err(msg), }; Ok(a + b) } - Calling a lot of functions returning a result can become tedious
  36. 41 The try! Macro fn do_calc() -> Result<i32, String> {

    let a = try!(do_subcalc1()); let b = try!(do_subcalc2()); Ok(a + b) } - The try! macro does the same thing, unwrap or early return - Error signature must match! - What if the errors don’t match?
  37. 42 Mapping Errors fn do_subcalc() -> Result<i32, String> { …

    } fn do_calc() -> Result<i32, Error> { let res = do_subcalc(); let mapped = res.map_err(|msg| { println!(“Error: {}”, msg); Error::CalcFailed }); let val = try!(mapped); Ok(val + 1) } - Convert them with helper methods - map_err passes through a successful result while handling an error - Explanation on next slide
  38. 43 Mapping Errors let mapped = res.map_err(|msg| Error::CalcFailed); is the

    same as let mapped = match res { Ok(val) => Ok(val), Err(msg) => Err(Error::CalcFailed), }; - Many other helper methods like this
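A runnable sketch of the error-mapping pattern. It assumes Rust 1.13 or later and uses the `?` operator, which performs the same unwrap-or-early-return as the `try!` macro on the slides. The body of `do_subcalc`, its `input` parameter and the sample values are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    CalcFailed,
}

// Hypothetical sub-calculation: fails with a String message on negative input.
fn do_subcalc(input: i32) -> Result<i32, String> {
    if input < 0 {
        Err(format!("negative input: {}", input))
    } else {
        Ok(input * 2)
    }
}

fn do_calc(input: i32) -> Result<i32, Error> {
    // map_err passes an Ok value through unchanged and converts the
    // String error into our own Error type, so that `?` can return it.
    let val = do_subcalc(input).map_err(|msg| {
        println!("Error: {}", msg);
        Error::CalcFailed
    })?;
    Ok(val + 1)
}

fn main() {
    println!("{:?}", do_calc(20));
    println!("{:?}", do_calc(-1));
}
```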
  39. 44 Other Combinator Methods (1) Get the value from an

    option. Option.unwrap(self) -> T Option.unwrap_or(self, def: T) -> T Option.unwrap_or_else<F>(self, f: F) -> T where F: FnOnce() -> T
  40. 45 Other Combinator Methods (2) Map an Option<T> to Option<U>

    or U. Option.map<U, F>(self, f: F) -> Option<U> where F: FnOnce(T) -> U Option.map_or<U, F>(self, default: U, f: F) -> U where F: FnOnce(T) -> U Option.map_or_else<U, D, F>(self, default: D, f: F) -> U where F: FnOnce(T) -> U, D: FnOnce() -> U
  41. 46 Other Combinator Methods (3) Convert an option to a

    result, mapping Some(v) to Ok(v) and None to Err(err). Option.ok_or<E>(self, err: E) -> Result<T, E> Option.ok_or_else<E, F>(self, err: F) -> Result<T, E> where F: FnOnce() -> E - This is only a small selection. - There are similar methods on Result and others. - They all make it easier to work without having null pointers.
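A short sketch that exercises some of the combinator methods listed on the last three slides. The functions `name_len` and `require_name` and the sample values are hypothetical.

```rust
// Hypothetical helper: length of an optional name, 0 if absent (map_or).
fn name_len(name: Option<&str>) -> usize {
    name.map_or(0, |n| n.len())
}

// Hypothetical helper: turns a missing name into an error (ok_or_else).
fn require_name(name: Option<&str>) -> Result<&str, String> {
    name.ok_or_else(|| "no name given".to_string())
}

fn main() {
    let some: Option<i32> = Some(4);
    let none: Option<i32> = None;

    // Getting the value out, with a fallback for None.
    assert_eq!(some.unwrap_or(0), 4);
    assert_eq!(none.unwrap_or(0), 0);
    assert_eq!(none.unwrap_or_else(|| 2 * 21), 42);

    // Mapping Option<T> to Option<U> or to U.
    assert_eq!(some.map(|v| v * 2), Some(8));
    assert_eq!(none.map_or(-1, |v| v * 2), -1);
    assert_eq!(some.map_or_else(|| -1, |v| v * 2), 8);

    // Converting an Option into a Result.
    assert_eq!(some.ok_or("missing"), Ok(4));
    assert_eq!(none.ok_or("missing"), Err("missing"));

    println!("{} {:?}", name_len(Some("Rust")), require_name(None));
}
```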
  42. 47 P2: No dangling pointers - Rust programs never try

    to access a heap-allocated value after it has been freed. - No garbage collection or reference counting involved! - Everything is enforced at compile time. 1 - Not an unusual promise, all type safe languages do this 2 3 - How is this done?
  43. 48 Three Rules - Rule 1: Every value has a

    single owner at any given time. - Rule 2: You can borrow a reference to a value, so long as the reference doesn’t outlive the value. - Rule 3: You can only modify a value when you have exclusive access to it. - 1: You can move a value from one owner to another, but when a value’s owner goes away the value is freed along with it. - 2: Borrowed references are temporary pointers; they allow you to operate on values you don’t own.
  44. 49 Ownership - Variables own their values - A struct

    owns its fields - An enum owns its values - Every heap-allocated value has a single pointer that owns it - All values are dropped when their owner is dropped
  45. 50 Ownership: Scoping { let s = “Chuchichästli”.to_string(); } //

    s goes out of scope, text is freed - Variables that go out of scope are freed
  46. 51 Ownership: Move Semantics { let s = “Chuchichästli”.to_string(); //

    t1 takes ownership from s let t1 = s; // compile-time error: use of moved value s let t2 = s; } - Assigning to variables moves values (most of the time)
  47. 52 Ownership: Copy Trait { let pi = 3.1415926f32; let

    foo = pi; let bar = pi; // This is fine! } - Types that implement the “Copy” trait (usually primitive types) are copied implicitly - Examples: char, bool, numeric types
  48. 53 Ownership: Clone Trait { let s = “Chuchichästli”.to_string(); let

    t1 = s.clone(); let t2 = s.clone(); } - Other types can implement Clone trait for explicit cloning - Three independent String objects - Each is owned by the variable binding
  49. 54 Ownership: Deriving Copy / Clone #[derive(Copy, Clone)] struct Color

    { r: u8, g: u8, b: u8 } - Implementing Copy and Clone is trivial for most types - So it can be auto-generated by the compiler - All fields must be Copy / Clone too
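The move, Copy and Clone slides put side by side in one program, as a sketch. The function `consume` and the extra `Debug` and `PartialEq` derives are additions for illustration.

```rust
#[derive(Debug, Copy, Clone, PartialEq)]
struct Color {
    r: u8,
    g: u8,
    b: u8,
}

// Hypothetical function that takes ownership of its argument.
fn consume(s: String) -> usize {
    s.len() // s is dropped when the function returns
}

fn main() {
    let s = "Chuchichästli".to_string();
    let t1 = s.clone(); // explicit copy: s is still usable afterwards
    let t2 = s; // move: ownership passes from s to t2
    // println!("{}", s); // would not compile: use of moved value `s`

    let bytes = consume(t2); // t2 is moved into the function
    println!("{} is {} bytes long", t1, bytes);

    let red = Color { r: 255, g: 0, b: 0 };
    let a = red; // Copy type: implicitly copied, red stays valid
    let b = red;
    assert_eq!(a, b);
    println!("{:?}", red);
}
```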
  50. 55 But what about this? let s = “Hello, world”.to_string();

    print_with_umpff(s); println!(“{}”, s); error: use of moved value: `s` println!(“{}”, s); ^ note: `s` moved here because it has type `collections::string:: String`, which is non-copyable print_with_umpff(s); ^ - Now you know move semantics - Can cause problems though. - Does someone see the problem? - Ownership is moved into function and freed when function returns
  51. 56 Borrowing let s = “Hello, world”.to_string(); print_with_umpff(&s); println!(“Original value

    was {}”, s); - The function can borrow the value - Many functions can borrow at the same time, because they cannot modify
  52. 57 Mutable Borrowing let mut s = “Hello, world”.to_string(); add_umpff(&mut

    s); println!(“New value is {}”, s); - A mutable borrow grants exclusive access - Only one mutable borrow possible at a time - While you borrow a mutable reference to a value, that reference is the only way to access that value at all.
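The slides do not show the bodies of `print_with_umpff` and `add_umpff`. The sketch below assumes hypothetical implementations for both, to show a shared borrow followed by a mutable borrow of the same value.

```rust
// Hypothetical body: a shared borrow is enough for reading.
fn print_with_umpff(s: &str) {
    println!("{}!!!", s);
}

// Hypothetical body: a mutable borrow is needed to change the value.
fn add_umpff(s: &mut String) {
    s.push_str("!!!");
}

fn main() {
    let mut s = "Hello, world".to_string();

    print_with_umpff(&s); // &String coerces to &str; s is only borrowed
    println!("Original value was {}", s);

    add_umpff(&mut s); // exclusive access for the duration of the call
    println!("New value is {}", s);
}
```

Because both functions only borrow, `s` is still owned by `main` after each call and is freed when `main` ends.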
  53. 58 Borrowing prevents moving let x = String::new(); let borrow

    = &x; let y = x; // error: cannot move out of `x` because // it is borrowed - While borrowed, a move must be prevented - Otherwise you might end up with a dangling pointer
  54. 59 Lifetimes let borrow; let x = String::new(); borrow =

    &x; // error: `x` does not live // long enough - What is the problem here? - Lifetime of borrow is longer than lifetime of x - This can also be visualized differently:
  55. 60 Lifetimes { let borrow; { let x = String::new();

    borrow = &x; // error: `x` does not live // long enough } } - Now it should be obvious. - Using lifetime checking, the compiler guarantees that there are no dangling pointers.
  56. 61 Lifetimes - Sometimes the compiler is wrong about automatically

    inferred lifetimes - It needs more knowledge - Parameters and return values can be annotated with explicit lifetimes - Won’t be covered here :)
  57. 62 P3: No buffer overruns - There’s no pointer arithmetic

    in Rust - Arrays in Rust are not just pointers - Bounds checks: at compile time where the compiler can prove them, otherwise at run time
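A minimal sketch of what bounds checking looks like from the programmer's side. The function `element_or_zero` and the sample array are hypothetical; `get` is the slice method that returns an `Option` instead of panicking.

```rust
// Hypothetical helper: reads an element, falling back to 0 when the
// index is out of range.
fn element_or_zero(values: &[u64], index: usize) -> u64 {
    values.get(index).copied().unwrap_or(0)
}

fn main() {
    let a = [42u64];

    // In-range access works as usual.
    println!("a[0] = {}", a[0]);

    // Out-of-range access through get() yields None instead of touching
    // memory outside the array.
    println!("a.get(3) = {:?}", a.get(3));
    println!("element_or_zero(&a, 3) = {}", element_or_zero(&a, 3));

    // Writing `a[3]` with a constant index is rejected by the compiler;
    // with an index only known at run time, the access panics instead of
    // overwriting the stack as in the C program shown earlier.
}
```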
  58. 64 We’ll make this short - The Rust compiler does

    not know about concurrency - Everything works based on the three rules - I’ll only show a few examples
  59. 65 Threads let t1 = std::thread::spawn(|| { return 23; });

    let t2 = std::thread::spawn(|| { return 19; }); let v1 = try!(t1.join()); let v2 = try!(t2.join()); println!(“{} + {} = {}”, v1, v2, v1 + v2); - Simple example - No shared data
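The thread example wrapped in a function so that it compiles on its own. The function name `add_in_threads` is hypothetical, and the `?` operator (Rust 1.13 or later) stands in for the `try!` macro used on the slide.

```rust
use std::thread;

// Spawns two threads that each return a number and adds the results.
// thread::Result carries the panic payload if a thread panicked.
fn add_in_threads() -> thread::Result<i32> {
    let t1 = thread::spawn(|| 23);
    let t2 = thread::spawn(|| 19);

    // join() blocks until the thread has finished and yields its return value.
    let v1 = t1.join()?;
    let v2 = t2.join()?;
    Ok(v1 + v2)
}

fn main() {
    match add_in_threads() {
        Ok(sum) => println!("23 + 19 = {}", sum),
        Err(_) => println!("a thread panicked"),
    }
}
```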
  60. 66 Mutexes / Arcs (1) let data = Arc::new(Mutex::new(0)); let

    data1 = data.clone(); let t1 = thread::spawn(move || { let mut guard = data1.lock().unwrap(); *guard += 19; }); let data2 = data.clone(); let t2 = thread::spawn(move || { let mut guard = data2.lock().unwrap(); *guard += 23; }); - Arcs allow multiple references to the same data. (Safe pointers.) Arcs can be cloned. - Value of an Arc gets dropped when references are 0. - Mutexes own their values. Using lock() acquires mutex. - Locking returns a MutexGuard as proxy. When guard is dropped, lock is released. - Not possible to forget about releasing. - Arc pointer moved into closures
  61. 67 Mutexes / Arcs (2) t1.join().unwrap(); t2.join().unwrap(); let guard =

    data.lock().unwrap(); assert_eq!(*guard, 42); - Threads need to be joined, otherwise result might not yet be ready. - Again, we need to acquire a Mutex lock. - We don’t need another Arc reference though.
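The two Mutex / Arc slides combined into one program, as a sketch. The function `parallel_sum`, which generalizes the two hand-written threads to one thread per increment, is a hypothetical addition.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: adds each increment to a shared counter from its
// own thread and returns the final value.
fn parallel_sum(increments: Vec<i32>) -> i32 {
    let data = Arc::new(Mutex::new(0));

    let handles: Vec<_> = increments
        .into_iter()
        .map(|inc| {
            // Each thread gets its own Arc pointing at the same Mutex.
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut guard = data.lock().unwrap();
                *guard += inc;
                // The guard is dropped here, which releases the lock.
            })
        })
        .collect();

    // Wait for all threads before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *data.lock().unwrap();
    total
}

fn main() {
    println!("19 + 23 = {}", parallel_sum(vec![19, 23]));
}
```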
  62. 68 Channels use std::sync::mpsc::channel; Signature: fn channel<T>() -> (Sender<T>, Receiver<T>)

    - mpsc: Multiple producers, single consumer - Unidirectional channel - Returns two ends: sender and receiver - Sender can be cloned, but not receiver
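A small sketch of a channel with several producers and one consumer. The function `collect_from_workers` and the values sent are hypothetical.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical helper: starts n worker threads that each send one value,
// then collects everything on the single receiving end.
fn collect_from_workers(n: u32) -> Vec<u32> {
    let (tx, rx) = channel();

    for id in 0..n {
        let tx = tx.clone(); // the sender can be cloned, one per producer
        thread::spawn(move || {
            tx.send(id * id).unwrap();
        });
    }

    // Drop the original sender so the receiver sees the channel close
    // once all workers have finished.
    drop(tx);

    let mut results: Vec<u32> = rx.iter().collect();
    results.sort(); // arrival order depends on thread scheduling
    results
}

fn main() {
    println!("{:?}", collect_from_workers(4));
}
```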
  63. 70 «Why Rust?» Free e-book by O’Reilly, ~50 pages. Highly

    recommended! This presentation is actually based on that book. http://www.oreilly.com/programming/free/why-rust.csp
  64. 71 «Rust Book» Not actually a book. Official guide to

    learning Rust. Great resource. https://doc.rust-lang.org/book/