Rust com Exemplos

Rust é uma linguagem de programação de sistemas moderna focada na segurança, velocidade e concorrência. Ela cumpre estes objetivos sendo segura quanto ao uso da memória sem usar recolha de lixo.

Rust com Exemplos é uma coleção de exemplos executáveis que ilustram vários conceitos da Rust e bibliotecas padrão. Para obter ainda mais destes exemplos, não te esqueças de instalar a Rust localmente e consulte a documentação oficial. Adicionalmente para os curiosos, também podes consultar o código-fonte para este local.

Agora começaremos!

Olá Mundo

Este é o código-fonte do programa "Hello World" tradicional:

// Isto é um comentário, e é ignorado pelo compilador.
// Podemos testar este código clicando o botão "Run" ou
// se preferirmos usar o nosso teclado, podemos usar o atalho "Ctrl + Enter"

// Este código é editável, sinta-se a vontade para alterá-lo!
// Podemos sempre retornar para o código original clicando o botão "Reset" ->

// Esta é a função principal.
fn main() {
    // Declarações nesta função são executadas quando o binário for chamado.

    // Imprime o texto na consola.
    println!("Hello World!");
}

println! é uma macro que imprime o texto na consola.

Um binário pode ser gerado usando o compilador da Rust: rustc:

$ rustc hello.rs

rustc produzirá um binário hello que pode ser executado:

$ ./hello
Hello World!

Atividade

Clique sobre o 'Run' acima para veres a saída esperada. Depois, adicione uma nova linha com uma segunda macro println! para que a saída exiba:

Hello World!
I'm a Rustacean!

Comentários

Qualquer programa requer comentários, e a Rust suporta algumas variedades diferentes:

  • Comentários normais que são ignorados pelo compilador:
    • // Comentários de linha que vão até ao final da linha.
    • /* Comentários de bloco que vão até o delimitador de fechamento. */
  • Comentários de documentação que são analisados para documentação da biblioteca em HTML:
    • /// Gera documentações da biblioteca para o seguinte item.
    • //! Gera as documentações da biblioteca para o item de fechamento.
fn main() {
    // Isto é um exemplo de comentário duma linha.
    // Existem duas barras no princípio da linha.
    // E nada escrito depois destas será lido pelo compilador.

    // println!("Hello, world!");

    // Execute-o. Vês? Agora tente eliminar as duas barras, e execute-o novamente.

    /*
     * Isto é um outro tipo de comentário, um bloco de comentário. Em geral,
     * comentários de linha são o estilo de comentário recomendado.
     * Mas os comentários de bloco são extremamente úteis para
     * desativar temporariamente pedaços de código.
     * /* Comentários de bloco podem ser /* encaixados, */ */ então apenas custa
     * alguns toques na tecla para comentar tudo nesta função `main()`.
     * /*/*/* Experimente-o tu mesmo! */*/*/
     */

    /*
    Nota: a anterior coluna de `*` foi inteiramente para estilo. Não existe
    de fato necessidade disto.
    */

    // Nós podemos manipular as expressões mais facilmente com comentários
    // de bloco do que com comentários de linha.
    // Tente eliminar os delimitadores de comentário para mudar o resultado:
    let x = 5 + /* 90 + */ 5;
    println!("Is `x` 10 or 100? x = {}", x);
}

Consulte também:

Documentação de biblioteca

Impressão Formatada

A impressão é manipulada por uma série de macros definidas na std::fmt, algumas das quais incluem:

  • format!: escreve o texto formatado para a String.
  • print!: o mesmo que format! mas o texto é imprimido para a consola (io::stdout).
  • println!: o mesmo que print! mas uma nova linha é anexada.
  • eprint!: o mesmo que print! mas o texto é imprimido para o erro padrão (io::stderr).
  • eprintln!: o mesmo que eprint! mas uma nova linha é anexada.

Todos analisam o texto da mesma maneira. Como uma vantagem, a Rust verifica a exatidão da formatação em tempo de compilação:

fn main() {
    // Em geral, as `{}` serão automaticamente substituídas por quaisquer
    // argumentos. Estes serão convertidos em sequências de caracteres.
    println!("{} days", 31);

    // Os argumentos posicionais podem ser usados. Especificar um inteiro
    // dentro de `{}` determina qual argumento adicional será substituído.
    // Os argumentos começam em 0 imediatamente depois da 
    // sequência de caracteres de formatação.
    println!("{0}, this is {1}. {1}, this is {0}", "Alice", "Bob");

    // As can named arguments.
    println!("{subject} {verb} {object}",
             object="the lazy dog",
             subject="the quick brown fox",
             verb="jumps over");

    // Uma formatação diferente pode ser invocada com a especificação do
    // carácter de formato depois dum `:`.
    println!("Base 10:               {}",   69420); // 69420
    println!("Base 2 (binary):       {:b}", 69420); // 10000111100101100
    println!("Base 8 (octal):        {:o}", 69420); // 207454
    println!("Base 16 (hexadecimal): {:x}", 69420); // 10f2c
    println!("Base 16 (hexadecimal): {:X}", 69420); // 10F2C

    // Podemos justificar o texto à direita com uma largura especificada.
    // Isto resultará em "    1".
    // (Quatro espaços em branco e um "1", para uma largura total de 5).
    println!("{number:>5}", number=1);

    // Podemos acolchoar números com zeros adicionais,
    println!("{number:0>5}", number=1); // 00001
    // e ajustar à esquerda com a virada do sinal. Isto resultará em "10000".
    println!("{number:0<5}", number=1); // 10000

    // Podemos usar argumentos nomeados no 
    // especificador de formato anexando um `$`.
    println!("{number:0>width$}", number=1, width=5);

    // A Rust ainda verifica para certificar-se 
    // que o número correto de argumentos são usados.
    println!("My name is {0}, {1} {0}", "Bond");
    // FIXME ^ Add the missing argument: "James"

    // Apenas tipos que implementam `fmt::Display` podem ser formatados com `{}`.
    // Os tipos definidos pelo utilizador 
    // não implementam `fmt::Display` por padrão.

    #[allow(dead_code)] // desativa `dead_code` que alerta sobre módulo não usado.
    struct Structure(i32);

    // Isto não compilará porque `Structure` não implementa `fmt::Display`
    // println!("This struct `{}` won't print...", Structure(3));
    // TODO ^ Try uncommenting this line

    // Para a Rust 1.58 e acima, podemos capturar diretamente o argumento duma
    // variável envolvente. Tal como acima, isto resultará em "    1",
    // 4 espaços em branco e um "1".
    let number: f64 = 1.0;
    let width: usize = 5;
    println!("{number:>width$}");
}

O std::fmt contém muitas traits que governam a exibição do texto. A forma de base de dois importantes são listados abaixo:

  • fmt::Debug: Usa o marcador {:?}. Formata o texto para fins de depuração.
  • fmt::Display: Usa o marcador {}. Formata o texto duma maneira mais elegante e agradável para o utilizador.

Cá, usamos fmt::Display porque a biblioteca std fornece implementações para estes tipos. Para imprimir texto para tipos personalizados, mais passos são necessários.

Implementar a característica fmt::Display implementa automaticamente a característica ToString que permite-nos converter o tipo para String.

Na linha 46, #[allow(dead_code)] é um atributo que apenas aplica-se ao módulo depois deste.

Atividades

  • Corrija o problema no código acima (vê a FIXME) para que esta execute sem erro.
  • Tente desfazer o comentário da linha que tenta formatar a estrutura Structure (vê TODO)
  • Adicione uma chamada de macro println! que imprime: Pi is roughly 3.142 controlando o número de casas decimais exibido. Para os fins deste exercício, use let pi = 3.141592 como uma estimativa para o pi. (Dica: podemos precisar de consultar a documentação de std::fmt para definir o número de decimais à exibir).

Consulte também:

std::fmt, macros, struct, traits, e dead_code.

Depuração

Todos os tipos que quiserem usar traits de formatação de std::fmt exigem que uma implementação seja imprimível. Implementações automáticas são apenas fornecidas para tipos tais como na biblioteca std. Todos os outros devem ser manualmente implementados de alguma maneira.

A trait de fmt::Debug torna isto muito direto. Todos os tipos podem derive (criar automaticamente) a implementação fmt::Debug. Isto não é verdadeiro para fmt::Display que deve ser implementada manualmente:

#![allow(unused)]
fn main() {
// Esta estrutura não pode ser impressa nem com `fmt::Display` ou
// com `fmt::Debug`.
struct UnPrintable(i32);

// O atributo `derive` cria automaticamente a implementação exigida
// para tornar esta `struct` imprimível com `fmt::Debug`.
#[derive(Debug)]
struct DebugPrintable(i32);
}

Todos os tipos da biblioteca std são automaticamente imprimíveis também com {:?}:

// Derivar a implementação `fmt::Debug` para `Structure`.
// `Structure` é uma estrutura que contém um único `i32`.
#[derive(Debug)]
struct Structure(i32);

// Colocar uma `Structure` dentro da estrutura `Deep`.
// Também a torna imprimível.
#[derive(Debug)]
struct Deep(Structure);

fn main() {
    // Imprimir com `{:?}` é semelhante à com `{}`.
    println!("{:?} months in a year.", 12);
    println!("{1:?} {0:?} is the {actor:?} name.",
             "Slater",
             "Christian",
             actor="actor's");

    // `Structure` é imprimível!
    println!("Now {:?} will print!", Structure(3));

    // O problema com `derive` é que não existe controlo sobre
    // a aparência do resultado. E se quisermos que isto só mostre um `7`?
    println!("Now {:?} will print!", Deep(Structure(7)));
}

Assim fmt::Debug definitivamente torna isto imprimível mas sacrifica alguma elegância. A Rust também fornece "impressão elegante" com {:#?}:

#[derive(Debug)]
struct Person<'a> {
    name: &'a str,
    age: u8
}

fn main() {
    let name = "Peter";
    let age = 27;
    let peter = Person { name, age };

    // Imprimir com elegância
    println!("{:#?}", peter);
}

Um pode manualmente implementar fmt::Display para controlar a exibição.

Consulte também:

attributes, derive, std::fmt, e struct

Exibição

fmt::Debug dificilmente parece compacto e limpo, então é frequentemente vantajoso personalizar a aparência da saída. Isto é feito implementando manualmente a fmt::Display, que usa o marcador de impressão {}. Sua implementação parece-se com:

#![allow(unused)]
fn main() {
// Importar (através de `use`) o módulo `fmt`
// para torná-lo disponível.
use std::fmt;

// Definir uma estrutura para qual `fmt::Display` será implementado.
// Isto é uma estrutura de tupla nomeada `Structure` que contém um `i32`.
struct Structure(i32);

// Para usar o marcador `{}`, a característica `fmt::Display`
// deve ser implementada manualmente para o tipo.
impl fmt::Display for Structure {
    // Esta característica exige `fmt` com esta exata assinatura.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Escrever estritamente o primeiro elemento para a corrente
        // de saída fornecida: `f`. Retorna `fmt::Result` que indica se
        // a operação foi bem-sucedida ou falhou. Nota que `write!` usa
        // sintaxe que é muito semelhante ao `println!`.
        write!(f, "{}", self.0)
    }
}
}

fmt::Display pode ser mais claro do que fmt::Debug mas isto apresenta um problema para a biblioteca std. Como deveriam os tipos ambíguos ser exibidos? Por exemplo, se a biblioteca std implementasse um único estilo para todos os Vec<T>, qual estilo seria? Seria qualquer destes dois?

  • Vec<path>: /:/etc:/home/username:/bin (separa-se nos :)
  • Vec<number>: 1,2,3 (separa-se nas ,)

Não, porque não existe estilo ideal para todos os tipos e a biblioteca std não presume import uma. fmt::Display não é implementada para Vec<T> ou para quaisquer outros contentores genéricos. fmt::Debug deve então ser usado para estes casos genéricos.

Isto não é um problema embora porque para qualquer tipo de contentor novo que não é genérico, fmt::Display pode ser implementado:

use std::fmt; // Importar `fmt`

// Uma estrutura segurando dois números. `Debug` será derivado assim os
// resultados podem ser contrastados com `Display`.
#[derive(Debug)]
struct MinMax(i64, i64);

// Implementar `Display` para `MinMax`.
impl fmt::Display for MinMax {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Usar `self.number` para fazer referência à cada ponto de
        // dados posicional.
        write!(f, "({}, {})", self.0, self.1)
    }
}

// Definir uma estrutura onde os campos são nomeáveis para comparação.
#[derive(Debug)]
struct Point2D {
    x: f64,
    y: f64,
}

// De maneira semelhante, implementar `Display` para `Point2D`.
impl fmt::Display for Point2D {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Personalizar, assim apenas `x` e `y` são denotados.
        write!(f, "x: {}, y: {}", self.x, self.y)
    }
}

fn main() {
    let minmax = MinMax(0, 14);

    println!("Compare structures:");
    println!("Display: {}", minmax);
    println!("Debug: {:?}", minmax);

    let big_range =   MinMax(-300, 300);
    let small_range = MinMax(-3, 3);

    println!("The big range is {big} and the small is {small}",
             small = small_range,
             big = big_range);

    let point = Point2D { x: 3.3, y: 7.2 };

    println!("Compare points:");
    println!("Display: {}", point);
    println!("Debug: {:?}", point);

    // Erro. Ambos `Debug` e `Display` foram implementados, mas `{:b}`
    // exige que `fmt::Binary` seja implementado. Isto não funcionará.
    // println!("What does Point2D look like in binary: {:b}?", point);
}

Assim, fmt::Display tem sido implementado mas fmt::Binary não tem, e portanto não pode ser usado. std::fmt tem vários tal como traits e cada um exige sua própria implementação. Isto é detalhado mais adiante em std::fmt.

Atividade

Depois de verificar a saída do exemplo acima, use a estrutura Point2D como um guia para adicionar uma estrutura Complex ao exemplo. Quando imprimida da mesma maneira, a saída deveria ser:

Display: 3.3 + 7.2i
Debug: Complex { real: 3.3, imag: 7.2 }

Consulte também:

derive, std::fmt, macros, struct, trait, e use

Caso de Teste: Lista

Implementar fmt::Display para uma estrutura onde os elementos devem cada um ser manipulados sequencialmente é complicado. O problema é que cada write! gera um fmt::Result. A manipulação apropriada disto exige lidar com todos os resultados. A Rust fornece o operador ? para exatamente este propósito.

Usar ? na write! parece-se com isto:

// Tentar `write!` para ver se erra. Se errar, retorne o erro.
// De outro modo continue.
write!(f, "{}", value)?;

Com ? disponível, implementar fmt::Display para uma Vec é simples:

use std::fmt; // Importar o módulo `fmt`.

// Definir uma estrutura nomeada `List` contendo uma `Vec`.
struct List(Vec<i32>);

impl fmt::Display for List {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Extrair o valor usando a indexação de tupla,
        // e criar uma referência para `vec`.
        let vec = &self.0;

        write!(f, "[")?;

        // Iterar sobre `v` em `vec` enquanto enumera a 
        // contagem da iteração em `count`.
        for (count, v) in vec.iter().enumerate() {
            // Para cada elemento exceto o primeiro, adicione uma vírgula.
            // Usar o operador `?` para retornar em erros.
            if count != 0 { write!(f, ", ")?; }
            write!(f, "{}", v)?;
        }

        // Fechar o parênteses retos e retornar um valor de `fmt::Result`
        write!(f, "]")
    }
}

fn main() {
    let v = List(vec![1, 2, 3]);
    println!("{}", v);
}

Atividade

Tente mudar o programa para que o índice de cada elemento no vetor seja também imprimido. A nova saída deveriam parecer-se com isto:

[0: 1, 1: 2, 2: 3]

Consulte também:

for, ref, Result, struct, ?, e vec!

Formatação

Nós vimos que a formatação é especificada através de uma sequência de caracteres de formatação:

  • format!("{}", foo) -> "3735928559"
  • format!("0x{:X}", foo) -> "0xDEADBEEF"
  • format!("0o{:o}", foo) -> "0o33653337357"

A mesma variável (foo) pode ser formatada diferentemente dependendo do qual tipo de argumento é usado: X vs o vs não especificado.

Esta funcionalidade de formatação é implementada através das características, e existe uma característica para cada tipo de argumento. A característica de formatação mais comum é Display, que lida com os casos onde o tipo do argumento é deixado não especificado: {} por exemplo.

use std::fmt::{self, Formatter, Display};

struct City {
    name: &'static str,
    // Latitude
    lat: f32,
    // Longitude
    lon: f32,
}

impl Display for City {
    // `f` é um amortecedor, e este método deve escrever a
    // sequência de caracteres de formatação para ela.
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        let lat_c = if self.lat >= 0.0 { 'N' } else { 'S' };
        let lon_c = if self.lon >= 0.0 { 'E' } else { 'W' };

        // `write!` é como `format!`, mas escreverá a sequência de caracteres de
        // formatação para um amortecedor (o primeiro argumento).
        write!(f, "{}: {:.3}°{} {:.3}°{}",
               self.name, self.lat.abs(), lat_c, self.lon.abs(), lon_c)
    }
}

#[derive(Debug)]
struct Color {
    red: u8,
    green: u8,
    blue: u8,
}

fn main() {
    for city in [
        City { name: "Dublin", lat: 53.347778, lon: -6.259722 },
        City { name: "Oslo", lat: 59.95, lon: 10.75 },
        City { name: "Vancouver", lat: 49.25, lon: -123.1 },
    ] {
        println!("{}", city);
    }
    for color in [
        Color { red: 128, green: 255, blue: 90 },
        Color { red: 0, green: 3, blue: 254 },
        Color { red: 0, green: 0, blue: 0 },
    ] {
        // Troque isto para usar {} uma vez que tiveres adicionado uma
        // implementação para `fmt::Display`.
        println!("{:?}", color);
    }
}

Tu podes ver uma lista completa de características de formatação e seus tipos de argumentos na documentação da std::fmt.

Atividade

Adicione uma implementação da característica fmt::Display para a estrutura Color acima para que a saída exiba algo como:

RGB (128, 255, 90) 0x80FF5A
RGB (0, 3, 254) 0x0003FE
RGB (0, 0, 0) 0x000000

Duas sugestões caso ficares sem saber o que fazer:

Consulte também:

std::fmt

Primitivos

A Rust fornece acesso à uma grande variedade de primitives. Um amostra inclui:

Tipos Escalares

  • Inteiros com sinais:
  • Inteiros sem sinais:
  • Ponto flutuante
  • Os valores escalares de código universal de char como 'a', 'α' e '∞' (são 4 bytes cada)
  • bool ou é true ou é false
  • O tipo de unitário (), cujo único valor possível é uma tupla vazia: ()

Apesar do valor dum tipo unitário ser uma tupla, não é considerado um tipo composto porque não contém vários valores.

Tipos Compostos

  • Arranjos como [1, 2, 3]
  • Tuplas como (1, true)

As variáveis podem sempre ser anotadas por tipo. Os números podem adicionalmente ser anotados através dum sufixo ou por padrão. Os inteiros predefine para i32 e flutuantes para f64. Nota que a Rust também pode inferir tipos a partir do contexto:

fn main() {
    // Variáveis podem ser anotadas por tipo.
    let logical: bool = true;

    let an_integer   = 5i32; // Suffix annotation Anotação de sufixo

    // Ou um padrão será usado.
    let default_float   = 3.0; // `f64`
    let default_integer = 7;   // `i32`

    // Um tipo também pode ser inferido a partir do contexto.
    let mut inferred_type = 12; // O tipo i64 é inferido a partir da outra linha.
    inferred_type = 4294967296i64;

    // Um valor da variável mutável pode ser mudado.
    let mut mutable = 12; // `i32` mutável
    mutable = 21;

    // Erro! O tipo duma variável não pode ser mudado.
    mutable = true;

    // As variáveis podem ser sobrescritas com obscurecimento.
    let mutable = true;
}

Consulte também:

a biblioteca std, mut, inference, e shadowing

Literais e Operadores

Os inteiros 1, flutuantes 1.2, caracteres 'a', sequências de caracteres "abc", booleanos true e o tipo unitário () podem ser expressados usando literais.

Os inteiros podem, alternativamente, ser expressados usando a notação hexadecimal, octal ou binária usando estes prefixos respetivamente: 0x, 0o ou 0b.

Os sublinhados podem ser inseridos nos literais numéricos para melhorar a legibilidade, por exemplo 1_000 é o mesmo que 1000, e 0.000_001 é o mesmo que 0.000001.

A Rust suporta a notação cientifica e, por exemplo 1e6, 7.6e-4. O tipo associado é f64.

Nós precisamos dizer ao compilador o tipo dos literais que usamos. Por agora, usaremos o sufixo u32 para indicar que o literal é um inteiro de 32-bit sem sinal, e o sufixo i32 para indicar que é um inteiro de 32-bit com sinal.

Os operadores disponíveis e suas precedências na Rust são semelhantes às outras linguagens parecidas com C:

fn main() {
    // Adição de inteiro
    println!("1 + 2 = {}", 1u32 + 2);

    // Subtração de inteiro
    println!("1 - 2 = {}", 1i32 - 2);
    // TODO ^ Tente mudar `1i32` para `1u32` para veres a importância do tipo

    // Notação cientifica
    println!("1e4 is {}, -2.5e-3 is {}", 1e4, -2.5e-3);

    // Lógica booleana de curto-circuito
    println!("true AND false is {}", true && false);
    println!("true OR false is {}", true || false);
    println!("NOT true is {}", !true);

    // Operações bitwise
    println!("0011 AND 0101 is {:04b}", 0b0011u32 & 0b0101);
    println!("0011 OR 0101 is {:04b}", 0b0011u32 | 0b0101);
    println!("0011 XOR 0101 is {:04b}", 0b0011u32 ^ 0b0101);
    println!("1 << 5 is {}", 1u32 << 5);
    println!("0x80 >> 2 is 0x{:x}", 0x80u32 >> 2);

    // Use sublinhados para melhorar a legibilidade!
    println!("One million is written as {}", 1_000_000u32);
}

Tuplas

Uma tupla é uma coleção de valores de tipos diferentes. As tuplas são construídas usando parênteses () e cada tupla por si mesma é um valor com a assinatura de tipo (T1, T2, ...), onde T1, T2 são os tipos dos seus membros. As funções podem usar tuplas para retornarem vários valores, visto que as tuplas podem segurar qualquer número de valores:

// As tuplas podem ser usadas como argumentos de função e retornar valores.
fn reverse(pair: (i32, bool)) -> (bool, i32) {
    // `let` pode ser usado para vincular os membros duma tupla às variáveis.
    let (int_param, bool_param) = pair;

    (bool_param, int_param)
}

// A seguinte estrutura é para a atividade.
#[derive(Debug)]
struct Matrix(f32, f32, f32, f32);

fn main() {
    // Uma tupla com um grupo de tipos diferentes.
    let long_tuple = (1u8, 2u16, 3u32, 4u64,
                      -1i8, -2i16, -3i32, -4i64,
                      0.1f32, 0.2f64,
                      'a', true);

    // Os valores podem ser extraídos da tupla usando indexação de tupla.
    println!("Long tuple first value: {}", long_tuple.0);
    println!("Long tuple second value: {}", long_tuple.1);

    // As tuplas podem ser membros de tupla.
    let tuple_of_tuples = ((1u8, 2u16, 2u32), (4u64, -1i8), -2i16);

    // As tuplas são imprimíveis.
    println!("tuple of tuples: {:?}", tuple_of_tuples);

    // Mas tuplas longas (mas de 12 elementos) não podem ser imprimidas.
    //let too_long_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13);
    //println!("Too long tuple: {:?}", too_long_tuple);
    // TODO ^ Tente desfazer o comentário das 2 linhas acima
    // para veres o erro do compilador

    let pair = (1, true);
    println!("Pair is {:?}", pair);

    println!("The reversed pair is {:?}", reverse(pair));

    // Para criar tuplas de um elemento, a vírgula é obrigatória para
    // distingui-las dum literal envolvido por parênteses.
    println!("One element tuple: {:?}", (5u32,));
    println!("Just an integer: {:?}", (5u32));

    // As tuplas podem ser desconstruídas para criar vínculos.
    let tuple = (1, "hello", 4.5, true);

    let (a, b, c, d) = tuple;
    println!("{:?}, {:?}, {:?}, {:?}", a, b, c, d);

    let matrix = Matrix(1.1, 1.2, 2.1, 2.2);
    println!("{:?}", matrix);
}

Atividade

  1. Recapitulação: Adicione a característica fmt::Display à estrutura Matrix no exemplo acima, para que se mudares da impressão do formato de depuração {:?} para o formato de exibição {}, consigas ver a seguinte saída:

    ( 1.1 1.2 )
    ( 2.1 2.2 )
    

    Podemos consultar o exemplo para impressão de exibição.

  2. Adicione uma função transpose usando a função reverse como molde, que aceita uma matriz como argumento, e retorna uma matriz na qual dos elementos têm sido trocados. Por exemplo:

    println!("Matrix:\n{}", matrix);
    println!("Transpose:\n{}", transpose(matrix));

    Resulta na saída:

    Matrix:
    ( 1.1 1.2 )
    ( 2.1 2.2 )
    Transpose:
    ( 1.1 2.1 )
    ( 1.2 2.2 )
    

Vetores e Pedaços

Um vetor é uma coleção de objetos do mesmo tipo T, armazenados na memória contígua. Os vetores são criados usando parênteses retos [], e o seu comprimento, que é conhecido em tempo de compilação, é parte da sua assinatura de tipo [T; length].

Os pedaços são semelhantes aos vetores, mas seu comprimento não é conhecido em tempo de compilação. Ao invés disto, um pedaço é um objeto de duas palavras; a primeira palavra é um ponteiro para o dado, e a segunda palavra o comprimento do pedaço. O tamanho da palavra é o mesmo que não ter tamanho usize, determinado pela arquitetura do processador, por exemplo, 64 bits numa x86-64. Os pedaços podem ser usados para pedir emprestado uma seção dum vetor e têm a assinatura de tipo &[T]:

use std::mem;

// Esta função pedi emprestado um pedaço.
fn analyze_slice(slice: &[i32]) {
    println!("First element of the slice: {}", slice[0]);
    println!("The slice has {} elements", slice.len());
}

fn main() {
    // Vetor de tamanho fixo (assinatura de tipo é supérflua).
    let xs: [i32; 5] = [1, 2, 3, 4, 5];

    // Todos os elementos podem ser inicializados para o mesmo valor.
    let ys: [i32; 500] = [0; 500];

    // A indexação começa em 0.
    println!("First element of the array: {}", xs[0]);
    println!("Second element of the array: {}", xs[1]);

    // `len` retorna a contagem de elementos no vetor.
    println!("Number of elements in array: {}", xs.len());

    // Os vetores são pilhas alocadas.
    println!("Array occupies {} bytes", mem::size_of_val(&xs));

    // Os vetores podem ser pedidos emprestado como pedaços.
    println!("Borrow the whole array as a slice.");
    analyze_slice(&xs);

    // Os pedaços podem apontar para uma seção dum vetor.
    // Eles são da forma [starting_index..ending_index].
    // `starting_index` é a primeira posição no pedaço.
    // `ending_index` é a última posição no pedaço.
    println!("Borrow a section of the array as a slice.");
    analyze_slice(&ys[1 .. 4]);

    // Exemplo de pedaço vazio `&[]`:
    let empty_array: [u32; 0] = [];
    assert_eq!(&empty_array, &[]);
    assert_eq!(&empty_array, &[][..]); // O mesmo mas mais verboso

    // Os vetores podem ser acessados seguramente usando `.get`, que retorna
    // uma `Option`. Isto pode ser correspondido como mostrado abaixo, ou usado
    // com `.expect()` se gostarias que o programa saísse com uma mensagem
    // agradável ao invés de felizmente continuar.
    for i in 0..xs.len() + 1 { // Oops, um elemento longe de mais!
        match xs.get(i) {
            Some(xval) => println!("{}: {}", i, xval),
            None => println!("Slow down! {} is too far!", i),
        }
    }

    // Indexação fora do limite no vetor causa erro de tempo de compilação.
    //println!("{}", xs[5]);
    // // Indexação fora do limite no pedaço causa erro de tempo de execução.
    //println!("{}", xs[..][5]);
}

Tipos Personalizados

Os tipos de dados personalizados da Rust são principalmente formados através de duas palavras-chave:

  • struct: define uma estrutura
  • enum: define uma enumeração

As constantes também podem ser criadas através das palavras-chave const e static.

Estruturas

Existem três tipos de estruturas ("structs") que podem ser criadas usando a palavra-chave struct:

  • Estruturas de tupla, que são, basicamente, tuplas nomeadas.
  • A estruturas de C clássicas,
  • Estruturas unitárias, que são sem campo, são úteis para genéricos.
// Um atributo para esconder avisos de código não usado.
#![allow(dead_code)]

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

// Uma estrutura unitária
struct Unit;

// Um estrutura de tupla
struct Pair(i32, f32);

// Uma estrutura com dois campos
struct Point {
    x: f32,
    y: f32,
}

// As estruturas podem ser reutilizadas como campos duma outra estrutura
struct Rectangle {
    // Um retângulo pode ser especificado pela posição dos cantos
    // superior esquerdo e inferior direito no espaço.
    top_left: Point,
    bottom_right: Point,
}

fn main() {
    // Criar a estrutura com a abreviatura de inicialização de campo
    let name = String::from("Peter");
    let age = 27;
    let peter = Person { name, age };

    // Imprimir em depuração a estrutura
    println!("{:?}", peter);

    // Instanciar um `Point`
    let point: Point = Point { x: 10.3, y: 0.4 };

    // Acessar os campos do ponto
    println!("point coordinates: ({}, {})", point.x, point.y);

    // Criar um novo ponto usando a sintaxe de atualização de estrutura
    // para usar os campos do nosso outro
    let bottom_right = Point { x: 5.2, ..point };

    // `bottom_right.y` será o mesmo que `point.y` porque usamos este campo
    // a partir de `point`
    println!("second point: ({}, {})", bottom_right.x, bottom_right.y);

    // Desestruturar o ponto usando uma vínculo `let`
    let Point { x: left_edge, y: top_edge } = point;

    let _rectangle = Rectangle {
        // A instanciação da estrutura também é uma expressão
        top_left: Point { x: left_edge, y: top_edge },
        bottom_right: bottom_right,
    };

    // Instanciar uma estrutura unitária
    let _unit = Unit;

    // Instanciar uma estrutura de tupla
    let pair = Pair(1, 0.1);

    // Acessar os campos duma estrutura de tupla
    println!("pair contains {:?} and {:?}", pair.0, pair.1);

    // Desestruturar uma estrutura de tupla
    let Pair(integer, decimal) = pair;

    println!("pair contains {:?} and {:?}", integer, decimal);
}

Atividade

  1. Adicione uma função rect_area que calcula a área dum Rectangle (tente usar desestruturação encaixada).
  2. Adicione uma função square que recebe um Point e um f32 como argumentos, e retorna um Rectangle com o seu canto superior esquerdo no ponto, e uma largura e altura correspondendo ao f32.

Consulte também

attributes, e desestruturação

Enumerações

A palavra-chave enum permite a criação dum tipo que pode ser um das poucas variantes diferentes. Qualquer que é válido como uma struct é também válido numa enum.

// Criar uma `enum` para classificar um evento de Web.
// Nota como nomes e o tipo da informação juntos especificam a variante:
// `PageLoad != PageUnload` e `KeyPress(char) != Paste(String)`.
// Cada um é diferente e independente.
enum WebEvent {
    // Uma variante `enum` pode ser qualquer`unit-like`,
    PageLoad,
    PageUnload,
    // como estruturas de tupla,
    KeyPress(char),
    Paste(String),
    // ou estruturas parecidas de C.
    Click { x: i64, y: i64 },
}

// Uma função que recebe uma enumeração `WebEvent` como
// um argumento e não retorna nada.
fn inspect(event: WebEvent) {
    match event {
        WebEvent::PageLoad => println!("page loaded"),
        WebEvent::PageUnload => println!("page unloaded"),
        // Desestruturar `c` a partir duma variante `enum`.
        WebEvent::KeyPress(c) => println!("pressed '{}'.", c),
        WebEvent::Paste(s) => println!("pasted \"{}\".", s),
        // Desestruturar `Click` para `x` e `y`.
        WebEvent::Click { x, y } => {
            println!("clicked at x={}, y={}.", x, y);
        },
    }
}

fn main() {
    let pressed = WebEvent::KeyPress('x');
    // `to_owned()` cria uma `String` própria a partir
    // dum pedaço de sequência de caracteres.
    let pasted  = WebEvent::Paste("my text".to_owned());
    let click   = WebEvent::Click { x: 20, y: 80 };
    let load    = WebEvent::PageLoad;
    let unload  = WebEvent::PageUnload;

    inspect(pressed);
    inspect(pasted);
    inspect(click);
    inspect(load);
    inspect(unload);
}

Pseudónimos de Tipo

Se usas um pseudónimo de tipo, podes fazer referência a cada variante de enumeração através do pseudónimo. Isto pode ser útil se o nome da enumeração for muito longo ou muito genérico, e queres renomeá-lo.

enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}

// Creates a type alias
// Cria um pseudónimo de tipo
type Operations = VeryVerboseEnumOfThingsToDoWithNumbers;

fn main() {
    // Nós podemos fazer referência a cada variante através do pseudónimo,
    // e não seu nome longo e inconveniente.
    let x = Operations::Add;
}

O lugar mais comum que verás isto é em blocos impl usando o pseudónimo Self.

enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}

impl VeryVerboseEnumOfThingsToDoWithNumbers {
    fn run(&self, x: i32, y: i32) -> i32 {
        match self {
            Self::Add => x + y,
            Self::Subtract => x - y,
        }
    }
}

Para saberes mais sobre as enumerações e pseudónimos de tipo, podes ler o relatório de estabilização a partir de quando esta funcionalidade foi estabilizada para Rust.

Consulte também:

match, fn, e String, RFC dos "Pseudónimos de tipo e variantes de enumeração" RFC

use

A declaração use pode ser usado então a delimitação do âmbito não é necessária:

// Um atributo para esconder avisos de código não usado.
#![allow(dead_code)]

enum Status {
    Rich,
    Poor,
}

enum Work {
    Civilian,
    Soldier,
}

fn main() {
    // Usar explicitamente cada nome assim estão disponíveis
    // sem a delimitação manual.
    use crate::Status::{Poor, Rich};
    // Usar automaticamente cada nome dentro de `Work`.
    use crate::Work::*;

    // Equivalente ao `Status::Poor`.
    let status = Poor;
    // Equivalente ao `Work::Civilian`.
    let work = Civilian;

    match status {
        // Nota a falta de delimitação de âmbito por causa do
        // uso explícito acima.
        Rich => println!("The rich have lots of money!"),
        Poor => println!("The poor have no money..."),
    }

    match work {
        // Nota novamente a falte de delimitação do âmbito.
        Civilian => println!("Civilians work!"),
        Soldier  => println!("Soldiers fight!"),
    }
}

Consulte também:

match e use

Parecida com C

enum também pode ser usado como enumerações parecidas com C.

// Um atributo para esconder os avisos de código não usado.
#![allow(dead_code)]

// enumeração com discriminador implícito (começa em 0)
enum Number {
    Zero,
    One,
    Two,
}

// enumeração com discriminador explícito
enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
}

fn main() {
    // `enums` podem ser moldados como inteiros.
    println!("zero is {}", Number::Zero as i32);
    println!("one is {}", Number::One as i32);

    println!("roses are #{:06x}", Color::Red as i32);
    println!("violets are #{:06x}", Color::Blue as i32);
}

Consulte também:

moldagem

Caso de Teste: Lista Ligada

Um maneira comum de implementar uma lista ligada é através de enums:

use crate::List::*;

enum List {
    // Cons: estrutura de tipo que envolve um elemento e 
    // um ponteiro para o próximo nó.
    Cons(u32, Box<List>),
    // Nil: Um nó que significa o fim da lista ligada
    Nil,
}

// Os métodos podem ser anexados à numa enumeração
impl List {
    // Criar uma lista vazia
    fn new() -> List {
        // `Nil` tem o tipo `List`
        Nil
    }

    // Consumir uma lista, e retornar a mesma lista com
    // um novo elemento na sua frente
    fn prepend(self, elem: u32) -> List {
        // `Cons` também tem o tipo `List`
        Cons(elem, Box::new(self))
    }

    // Retornar o comprimento da lista
    fn len(&self) -> u32 {
        // `self` precisa ser correspondido, porque o comportamento deste método
        // depende da variante do `self`.
        // `self` tem o tipo `&List`, e `*self` tem o tipo `List`,
        // correspondência dum tipo concreto `T` é preferida sobre 
        // uma correspondência numa referência `&T`, depois da Rust 2018 
        // podes usar `self` aqui e também perseguir (sem referência) abaixo,
        // Rust inferirá `&s` e referenciará a cauda.
        // See https://doc.rust-lang.org/edition-guide/rust-2018/ownership-and-lifetimes/default-match-bindings.html
        match *self {
            // Não possível tomar posse da cauda, porque `self` é emprestado;
            // ao invés disto recebe uma referência à cauda
            Cons(_, ref tail) => 1 + tail.len(),
            // Caso de Base: Uma lista vaziam tem comprimento zero
            Nil => 0
        }
    }

    // Retornar a representação da lista como uma
    // sequência de caracteres (monte alocado)
    fn stringify(&self) -> String {
        match *self {
            Cons(head, ref tail) => {
                // `format!` é semelhante à `print!`, mas retorna uma
                // sequência de caracteres de monte alocado de impressão
                // para a consola.
                format!("{}, {}", head, tail.stringify())
            },
            Nil => {
                format!("Nil")
            },
        }
    }
}

fn main() {
    // Criar uma lista ligada vazia
    let mut list = List::new();

    // Adicionar ao começo da lista alguns elementos
    list = list.prepend(1);
    list = list.prepend(2);
    list = list.prepend(3);

    // Mostrar o estado final da lista
    println!("linked list has length: {}", list.len());
    println!("{}", list.stringify());
}

Consulte também:

Box e métodos

Constantes

A Rust tem dois tipos diferentes de constantes que podem ser declarados em qualquer âmbito incluindo global. Ambos exigem anotação de tipo explícita:

  • const: Um valor não modificável (o caso comum).
  • static: Uma variável possivelmente mutável com a vida útil de 'static.
    • A vida útil estática é inferida e não precisa de ser especificada.
    • O acesso ou modificação duma variável estática mutável é unsafe.
// As globais são declaradas fora de todos os outros âmbitos.
static LANGUAGE: &str = "Rust";
const THRESHOLD: i32 = 10;

fn is_big(n: i32) -> bool {
    // Acessar constante em alguma função
    n > THRESHOLD
}

fn main() {
    let n = 16;

    // Acessar constante na linha principal
    println!("This is {}", LANGUAGE);
    println!("The threshold is {}", THRESHOLD);
    println!("{} is {}", n, if is_big(n) { "big" } else { "small" });

    // Erro! Não possível modificar uma `const`.
    THRESHOLD = 5;
    // FIXME ^ Comente esta linha
}

Consulte também:

O RFC de const/static, vida útil da 'static.

Vínculos de Variável

A Rust fornece segurança de tipo através da definição de tipos estáticos. Os vínculos de variável pode ter o seu tipo anotado quando declarados. No entanto, na maioria dos casos, o compilador será capaz de inferir o tipo da variável a partir do contexto, reduzindo grandemente o fardo da anotação.

Os valores (tais como literais) podem ser vinculados às variáveis, usando o vínculo de let.

fn main() {
    let an_integer = 1u32;
    let a_boolean = true;
    let unit = ();

    // copiar `an_integer` para `copied_integer`
    let copied_integer = an_integer;

    println!("An integer: {:?}", copied_integer);
    println!("A boolean: {:?}", a_boolean);
    println!("Meet the unit value: {:?}", unit);

    // O compilador avisa sobre vínculos de variáveis não usados;
    // estes avisos podem ser silenciados prefixando o nome da variável
    // com um sublinhado.
    let _unused_variable = 3u32;

    let noisy_unused_variable = 2u32;
    // FIXME ^ Prefixe com um sublinhado para suprimir o aviso
    // Nota que os avisos podem não ser exibidos num navegador

Mutabilidade

Os vínculos de variável são imutáveis por padrão, mas isto pode ser sobreposto usando o modificador mut.

fn main() {
    let _immutable_binding = 1;
    let mut mutable_binding = 1;

    println!("Before mutation: {}", mutable_binding);

    // Ok
    mutable_binding += 1;

    println!("After mutation: {}", mutable_binding);

    // Erro! Não é possível atribuir um novo valor à uma variável imutável
    _immutable_binding += 1;
}

O compilador lançará um diagnóstico detalhado sobre os erros de mutabilidade.

Âmbito e Obscurecimento

Os vínculos de variável têm um âmbito, e são obrigados a viver num bloco. Um bloco é uma coleção de declarações fechadas por chavetas {}.

fn main() {
    // Este vínculo vive na função principal
    let long_lived_binding = 1;

    // Isto é um bloco, e tem um âmbito mais pequeno do que a função principal
    {
        // Este vínculo apenas existe neste bloco
        let short_lived_binding = 2;

        println!("inner short: {}", short_lived_binding);
    }
    // Fim do bloco

    // Erro! `short_lived_binding` não existe neste âmbito
    println!("outer short: {}", short_lived_binding);
    // FIXME ^ Comente esta linha

    println!("outer long: {}", long_lived_binding);
}

Além disto, obscurecimento de variável é permitido.

fn main() {
    let shadowed_binding = 1;

    {
        println!("before being shadowed: {}", shadowed_binding);

        // Este vínculo *obscurece* aquela que está no exterior
        let shadowed_binding = "abc";

        println!("shadowed in inner block: {}", shadowed_binding);
    }
    println!("outside inner block: {}", shadowed_binding);

    // Este vínculo *obscurece* o vínculo anterior
    let shadowed_binding = 2;
    println!("shadowed in outer block: {}", shadowed_binding);
}

Declarar primeiro

É possível declarar vínculos de variável primeiro, e inicializa-los depois. No entanto, esta forma é raramente usada, visto que pode conduzir à uso de variáveis não inicializadas.

fn main() {
    // Declarar um vínculo de variável
    let a_binding;

    {
        let x = 2;

        // Inicializar o vínculo
        a_binding = x * x;
    }

    println!("a binding: {}", a_binding);

    let another_binding;

    // Erro! Uso de vínculo não inicializado
    println!("another binding: {}", another_binding);
    // FIXME ^ Comenta esta linha

    another_binding = 1;

    println!("another binding: {}", another_binding);
}

O compilador proíbe o uso de variáveis não inicializadas, visto que isto conduziria à comportamento não definido.

Congelação

Quando o dado estiver vinculado ao mesmo nome imutavelmente, ele também congela. O dado congelado não pode ser modificado até o vínculo imutável sair do âmbito:

fn main() {
    let mut _mutable_integer = 7i32;

    {
        // Obscurecimento pela `_mutable_integer` imutável
        let _mutable_integer = _mutable_integer;

        // Erro! `_mutable_integer` está congelado neste âmbito
        _mutable_integer = 50;
        // FIXME ^ Comente esta linha

        // `_mutable_integer` sai fora do âmbito
    }

    // Ok! `_mutable_integer` não está congelado neste âmbito
    _mutable_integer = 3;
}

Tipos

A Rust fornece vários mecanismos para mudar ou definir o tipo de tipos primitivos e definidos pelo utilizador. As seguintes seções cobrem:

Moldagem

A Rust não fornece nenhuma conversão (coerção) de tipo implícita entre tipos primitivos. Mas, a conversão (moldagem) de tipo explícita pode ser realizada usando a palavra-chave as.

As regras para converter entre tipos integrais segue geralmente as convenções da C, exceto em casos onde a C tem comportamento não definido. O comportamento de todos os moldes entre tipos integrais é bem definido na Rust.

// Suprimir todos os avisos dos moldes que transbordam.
#![allow(overflowing_literals)]

fn main() {
    let decimal = 65.4321_f32;

    // Error! Sem conversão implícita
    let integer: u8 = decimal;
    // FIXME ^ Comente esta linha

    // Conversão explícita
    let integer = decimal as u8;
    let character = integer as char;

    // Error! Existem limitações nas regras de conversão.
    // Um flutuante não pode ser diretamente convertido à um carácter.
    let character = decimal as char;
    // FIXME ^ Comente esta linha

    println!("Casting: {} -> {} -> {}", decimal, integer, character);

    // quando moldamos qualquer valor à um tipo sem sinal, T,
    // T::MAX + 1 é adicionado ou subtraído até o valor caber num novo tipo

    // 1000 já cabe num u16
    println!("1000 as a u16 is: {}", 1000 as u16);

    // 1000 - 256 - 256 - 256 = 232
    // Nos bastidores, o primeiro os bits menos significativos de 8 são mantidos,
    // enquanto o resto perto do bit mais significativo é truncado.
    println!("1000 as a u8 is : {}", 1000 as u8);
    // -1 + 256 = 255
    println!("  -1 as a u8 is : {}", (-1i8) as u8);

    // Para os números positivos, isto é o mesmo que modulus
    println!("1000 mod 256 is : {}", 1000 % 256);

    // Quando moldamos à um tipo com sinal, o resultado (bitwise) é o mesmo que
    // a primeira moldagem ao tipo sem sinal correspondente.
    // Se o bit mais significativo deste valor for 1, então o valor é negativo.

    // A menos que já caiba, claro.
    println!(" 128 as a i16 is: {}", 128 as i16);

    // 128 as u8 -> 128,
    // cujo valor na representação de complemento dos dois de 8-bit é:
    println!(" 128 as a i8 is : {}", 128 as i8);

    // repetindo o exemplo acima
    // 1000 as u8 -> 232
    println!("1000 as a u8 is : {}", 1000 as u8);
    // e o valor de 232 na representação de complemento dos dois de 8-bit é -24
    println!(" 232 as a i8 is : {}", 232 as i8);

    // Desde a Rust 1.45, a palavra-chave `as` realiza um *molde de saturação*
    // quando moldamos de flutuante à inteiro. Se o valor de ponto flutuante
    // excede o saldo máximo ou é menos do que o saldo mínimo, o valor
    // retornado será igual ao salto cruzado.

    // 300.0 as u8 é 255
    println!(" 300.0 as u8 is : {}", 300.0_f32 as u8);
    // -100.0 as u8 é 0
    println!("-100.0 as u8 is : {}", -100.0_f32 as u8);
    // nan as u8 é 0
    println!("   nan as u8 is : {}", f32::NAN as u8);

    // Este comportamento incorre num pequeno custo de execução e
    // pode ser evitado com métodos inseguros, no entanto o resultado pode
    // transbordar e retornar **valores incertos**. Use estes métodos sabiamente:
    unsafe {
        // 300.0 as u8 é 44
        println!(" 300.0 as u8 is : {}", 300.0_f32.to_int_unchecked::<u8>());
        // -100.0 as u8 é 156
        println!("-100.0 as u8 is : {}", (-100.0_f32).to_int_unchecked::<u8>());
        // nan as u8 é 0
        println!("   nan as u8 is : {}", f32::NAN.to_int_unchecked::<u8>());
    }
}

Literais

Os literais numéricos podem ser anotados por tipo adicionado o tipo como um sufixo. Como um exemplo, para especificar que o literal 42 deveria ter o tipo i32, escreva 42i32.

O tipo dos literais numéricos sem sufixo depende de como são usados. Se não existir nenhuma restrição, o compilador usará i32 para os inteiros, e f64 para os números de ponto flutuante.

fn main() {
    // Literais com sufixo, seus tipos são conhecidos na inicialização
    let x = 1u8;
    let y = 2u32;
    let z = 3f32;

    // Literais sem sufixo, seus tipos dependem de como são usados
    let i = 1;
    let f = 1.0;

    // `size_of_val` retorna o tamanho duma variável em bytes
    println!("size of `x` in bytes: {}", std::mem::size_of_val(&x));
    println!("size of `y` in bytes: {}", std::mem::size_of_val(&y));
    println!("size of `z` in bytes: {}", std::mem::size_of_val(&z));
    println!("size of `i` in bytes: {}", std::mem::size_of_val(&i));
    println!("size of `f` in bytes: {}", std::mem::size_of_val(&f));
}

Existem alguns conceitos usados no código anterior que ainda não foram explicados, cá está uma breve explicação para os leitores impacientes:

  • std::mem::size_of_val é uma função, mas chamada com o seu caminho completo. O código pode ser separado em unidades lógicas chamadas de módulos. Neste caso, a função size_of_val é definida no módulo mem, e o módulo mem é definida no caixote std. Para mais detalhes, consulte os módulos e caixotes.

Inferência

O motor da inferência de tipo é muito inteligente. Ele faz mais do que olhar o tipo da expressão do valor durante uma inicialização. Ele também olha em como a variável é usado mais tarde para inferir o seu tipo. Cá está um exemplo avançado da inferência de tipo:

fn main() {
    // Por causa da anotação, o compilador sabe que `elem` tem o tipo u8.
    let elem = 5u8;

    // Criar um vetor vazio (uma tabela que cresce).
    let mut vec = Vec::new();
    // Neste momento o compilador não sabe o tipo exato de `vec`,
    // apenas sabe que é um vetor de alguma coisa (`Vec<_>`).

    // Inserir `elem` no vetor.
    vec.push(elem);
    // Aha! Agora o compilador sabe que `vec` é um vetor de `u8` (`Vec<u8>`)
    // TODO ^ Tente comentar a linha `vec.push(elem)`

    println!("{:?}", vec);
}

Nenhuma anotação de tipo de variáveis foi necessária, o compilador está feliz e consequentemente o programador!

Pseudónimos

A declaração type pode ser usada para dar um novo nome à um tipo existente. Os tipos devem ter nomes com o seguinte padrão UpperCamelCase, ou o compilador levantará um aviso. As exceções à esta regra são os tipos primitivos: usize, f32, etc.

// `NanoSecond`, `Inch`, e `U64` são novos nomes para `u64`.
type NanoSecond = u64;
type Inch = u64;
type U64 = u64;

fn main() {
    // `NanoSecond` = `Inch` = `U64` = `u64`.
    let nanoseconds: NanoSecond = 5 as U64;
    let inches: Inch = 2 as U64;

    // Nota que os pseudónimos de tipo *não* fornecem qualquer segurança de
    // tipo adicional, porque os pseudónimos *não* são novos tipos
    println!("{} nanoseconds + {} inches = {} unit?",
             nanoseconds,
             inches,
             nanoseconds + inches);
}

O uso principal de pseudónimos é reduzir a complexidade; por exemplo o tipo io::Result<T> é um pseudónimo para o tipo Result<T, io::Error>.

Consulte também:

Atributos

Conversão

Os tipos primitivos podem ser convertidos à um ao outro através da moldagem.

A Rust aborda a conversão entre tipos personalizados (por exemplo, struct e enum) com o uso de características. As conversões genéricas usarão as características From e Into. No entanto existem aquelas mais específicas para casos mais comuns, em especial quando convertemos para e de String.

From e Into

As características From e Into estão inerentemente ligadas, e isto é efetivamente parte da sua implementação. Se fores capaz de converter o tipo A a partir do tipo B, então deveria ser fácil acreditar que deveríamos ser capaz de converter o tipo B para o tipo A.

From

A característica From permite um tipo definir como criar-se a partir doutro tipo, daí fornecendo um mecanismo muito simples para converter entre vários tipos. Existem numerosas implementações desta característica dentro da biblioteca padrão para conversão de tipos primitivos e comuns.

Por exemplo podemos facilmente converter uma str numa String:

#![allow(unused)]
fn main() {
let my_str = "hello";
let my_string = String::from(my_str);
}

Nós podemos fazer algo semelhante para definir uma conversão para o nosso próprio tipo.

use std::convert::From;

#[derive(Debug)]
struct Number {
    value: i32,
}

impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let num = Number::from(30);
    println!("My number is {:?}", num);
}

Into

A característica Into é simplesmente o recíproco da característica From. Isto é, se tiveres implementado a característica From para o teu tipo, Into o chamará quando necessário.

O uso da característica Into normalmente exigirá especificação do tipo para converter conforme o compilador for incapaz de determinar isto na maioria das vezes. No entanto, isto é um pequeno compromisso considerando que recebemos a funcionalidade gratuitamente:

use std::convert::From;

#[derive(Debug)]
struct Number {
    value: i32,
}

impl From<i32> for Number {
    fn from(item: i32) -> Self {
        Number { value: item }
    }
}

fn main() {
    let int = 5;
    // Tente remover a anotação de tipo
    let num: Number = int.into();
    println!("My number is {:?}", num);
}

TryFrom e TryInto

Semelhante ao From e Into, TryFrom e TryInto são características genéricas para converter entre tipos. Ao contrário de From e Into, as características TryFrom e TryInto são usadas para conversões falíveis, e como tal, retornam Result.

use std::convert::TryFrom;
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
struct EvenNumber(i32);

impl TryFrom<i32> for EvenNumber {
    type Error = ();

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        if value % 2 == 0 {
            Ok(EvenNumber(value))
        } else {
            Err(())
        }
    }
}

fn main() {
    // TryFrom

    assert_eq!(EvenNumber::try_from(8), Ok(EvenNumber(8)));
    assert_eq!(EvenNumber::try_from(5), Err(()));

    // TryInto

    let result: Result<EvenNumber, ()> = 8i32.try_into();
    assert_eq!(result, Ok(EvenNumber(8)));
    let result: Result<EvenNumber, ()> = 5i32.try_into();
    assert_eq!(result, Err(()));
}

Para e a partir de Sequências de Caracteres

Converter para Sequência de Caracteres

Converter qualquer tipo para uma String é tão simples quando implementar a característica ToString para o tipo. Ao invés de fazer isto diretamente, deves implementar a característica fmt::Display que auto-magicamente fornece ToString e também permite imprimir o tipo conforme discutidos na seção print!.

use std::fmt;

struct Circle {
    radius: i32
}

impl fmt::Display for Circle {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Circle of radius {}", self.radius)
    }
}

fn main() {
    let circle = Circle { radius: 6 };
    println!("{}", circle.to_string());
}

Analisando uma Sequência de Caracteres

Um dos tipos mais comum a converter para uma sequência de caracteres é um número. A abordagem idiomática para isto é usar a função parse e ou organizar para a inferência de tipo ou especificar o tipo a analisar usando a sintaxe 'turbofish'. Ambas alternativas são mostradas no seguinte exemplo.

Isto converterá a sequência de caracteres para o tipo especificado enquanto a característica FromStr é implementada para este tipo. Isto é implementado para numerosos tipos dentro da biblioteca padrão. Para obter esta funcionalidade num tipo definido pelo utilizador simplesmente implemente a característica FromStr para este tipo.

fn main() {
    let parsed: i32 = "5".parse().unwrap();
    let turbo_parsed = "10".parse::<i32>().unwrap();

    let sum = parsed + turbo_parsed;
    println!("Sum: {:?}", sum);
}

Expressões

Um programa de Rust é (maioritariamente) composto duma série de declarações e expressões:

fn main() {
    // declaração
    // declaração
    // declaração
}

Existem alguns tipos de declarações na Rust. Os dois mais comuns são declaração dum vínculo de variável, e o uso dum ; com expressão:

fn main() {
    // vínculo de variável
    let x = 5;

    // expressão;
    x;
    x + 1;
    15;
}

Os blocos são também expressões, então podem ser usados como valores em atribuições. A última expressão no bloco serão atribuídos à expressão do local tal como uma variável local. No entanto, se a última expressão do bloco termina com um sinal de ponto e vírgula, o valor de retorno será ():

fn main() {
    let x = 5u32;

    let y = {
        let x_squared = x * x;
        let x_cube = x_squared * x;

        // Esta expressão será atribuída à `y`
        x_cube + x_squared + x
    };

    let z = {
        // O ponto e vírgula suprime esta expressão e `()` é atribuído à `z`
        2 * x;
    };

    println!("x is {:?}", x);
    println!("y is {:?}", y);
    println!("z is {:?}", z);
}

Fluxo de Controlo

Uma parte integral de qualquer linguagem de programação são as maneiras de modificar o fluxo de controlo: if ou else, for e outros. Falaremos sobre os mesmos na Rust.

if ou else

A ramificação com if ou else é semelhante às outras linguagens. Ao contrário delas, a condição booleana não precisa ser envolvida por parênteses, e cada condição é seguida por um bloco. As condicionais if e else são expressões, e todos os ramos deve retornar o mesmo tipo.

fn main() {
    let n = 5;

    if n < 0 {
        print!("{} is negative", n);
    } else if n > 0 {
        print!("{} is positive", n);
    } else {
        print!("{} is zero", n);
    }

    let big_n =
        if n < 10 && n > -10 {
            println!(", and is a small number, increase ten-fold");

            // Esta expressão retorna uma `i32`.
            10 * n
        } else {
            println!(", and is a big number, halve the number");

            // Esta expressão deve retornar também uma `i32`.
            n / 2
            // TODO ^ Tente suprimir esta expressão com um ponto e vírgula.
        };
    //   ^ Não esqueça do ponto e vírgula! Todos vínculos de `let` precisam.

    println!("{} -> {}", n, big_n);
}

Laço de Repetição

A Rust fornece uma palavra-chave loop para indicar um laço infinito.

A declaração break pode ser usada para sair dum laço de repetição a qualquer momento, ao passo que a declaração continue pode ser usada para ignorar o resto da iteração e começar uma nova.

fn main() {
    let mut count = 0u32;

    println!("Let's count until infinity!");

    // Laço de repetição infinita
    loop {
        count += 1;

        if count == 3 {
            println!("three");

            // Ignorar o resto desta iteração
            continue;
        }

        println!("{}", count);

        if count == 5 {
            println!("OK, that's enough");

            // Sair deste laço de repetição
            break;
        }
    }
}

Encaixamentos e Rótulos

É possível rebentar (break) ou ignorar (continue) laços de repetição externos quando lidamos com laços de repetição encaixados. Nestes casos, os laços devem ser anotados com o mesmo rótulo ('label), e o rótulo deve ser passado à declaração break ou continue.

#![allow(unreachable_code, unused_labels)]

fn main() {
    'outer: loop {
        println!("Entered the outer loop");

        'inner: loop {
            println!("Entered the inner loop");

            // Isto apenas rebentaria o laço de repetição interno
            //break;

            // Isto rebenta o laço de repetição externo
            break 'outer;
        }

        println!("This point will never be reached");
    }

    println!("Exited the outer loop");
}

Regressar dos Laços de Repetições

Um dos usos dum loop é para voltar a tentar uma operação até ser bem-sucedida. Mas se a operação retornar um valor, podes precisar de passá-lo ao resto do código: colocá-lo depois da break, e será retornado pela expressão loop.

fn main() {
    let mut counter = 0;

    let result = loop {
        counter += 1;

        if counter == 10 {
            break counter * 2;
        }
    };

    assert_eq!(result, 20);
}

while

A palavra-chave while pode ser usada para executar um laço de repetição enquanto (while) uma condição for verdadeira.

Escreveremos o infame FizzBuzz usando um laço de repetição while.

fn main() {
    // Uma variável contadora
    let mut n = 1;

    // Laço de repetição enquanto `n` for menor do que 101 
    while n < 101 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }

        // Incrementar contador
        n += 1;
    }
}

Laços de Repetição for

for e o limite

A construção for in pode ser usada para iterar através dum Iterator. Uma das maneiras mais fáceis de criar um iterador é usar a notação de limite a..b. Isto produz valores a partir de a (inclusivo) à b (exclusivo) em passos de um.

Vamos escrever FizzBuzz usando for ao invés de while.

fn main() {
    // `n` receberá os valores: 1, 2, ..., 100 em cada iteração
    for n in 1..101 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }
    }
}

Alternativamente, a..=b pode ser usado para um limite que é inclusivo em ambos fins. O exemplo acima pode ser escrito como:

fn main() {
    // `n` receberá os valores: 1, 2, ..., 100 em cada iteração
    for n in 1..=100 {
        if n % 15 == 0 {
            println!("fizzbuzz");
        } else if n % 3 == 0 {
            println!("fizz");
        } else if n % 5 == 0 {
            println!("buzz");
        } else {
            println!("{}", n);
        }
    }
}

for e os iteradores

A construção for in é capaz de interagir com um Iterator de muitas maneiras. Conforme discutido na seção sobre a característica Iterator, por padrão o laço for aplicará a função into_iter à coleção. No entanto, isto não é o único significado de converter coleções em iteradores.

into_iter, iter e iter_mut manipulam a conversão duma coleção num iterador de maneiras diferentes, fornecendo visões diferentes sobre os dados no interior.

  • iter - Este pedi emprestado cada elemento da coleção durante cada iteração. Assim deixando a coleção intocada e disponível para reuso depois do laço.
fn main() {
    let names = vec!["Bob", "Frank", "Ferris"];

    for name in names.iter() {
        match name {
            &"Ferris" => println!("There is a rustacean among us!"),
            // TODO ^ Tente eliminar o & e corresponder apenas "Ferris"
            _ => println!("Hello {}", name),
        }
    }
    
    println!("names: {:?}", names);
}
  • into_iter - Este consume a coleção para que em cada iteração o dado exato ser fornecido. Assim que a coleção tiver sido consumida não estará disponível para reuso como se tivesse sido 'movida' dentro do laço.
fn main() {
    let names = vec!["Bob", "Frank", "Ferris"];

    for name in names.into_iter() {
        match name {
            "Ferris" => println!("There is a rustacean among us!"),
            _ => println!("Hello {}", name),
        }
    }
    
    println!("names: {:?}", names);
    // FIXME ^ Comente esta linha
}
  • iter_mut - Este pedi emprestado de maneira mutável cada elemento da coleção, permitindo a coleção ser modificada no lugar.
fn main() {
    let mut names = vec!["Bob", "Frank", "Ferris"];

    for name in names.iter_mut() {
        *name = match name {
            &mut "Ferris" => "There is a rustacean among us!",
            _ => "Hello",
        }
    }

    println!("names: {:?}", names);
}

Nos trechos acima nota o tipo do ramo match, que é a diferença chave nos tipos de iteração. A diferença no tipo portanto com certeza implica que diferentes ações que são capazes de ser realizadas.

Consulte também:

Iterator

match

A Rust fornece correspondência de padrão através da palavra-chave match, que pode ser usada como uma switch de C. O primeiro braço de correspondência é avaliado e todos os valores possíveis devem ser cobertos.

fn main() {
    let number = 13;
    // TODO ^ Tente valores diferentes para `number`

    println!("Tell me about {}", number);
    match number {
        // Corresponder um único valor
        1 => println!("One!"),
        // Corresponder vários valores
        2 | 3 | 5 | 7 | 11 => println!("This is a prime"),
        // TODO ^ Tente adicionar 13 à lista de valores primos
        // Corresponder um limite inclusivo
        13..=19 => println!("A teen"),
        // Lidar com o resto dos casos
        _ => println!("Ain't special"),
        // TODO ^ Tente comentar este braço de captura total
    }

    let boolean = true;
    // `match` também é uma expressão
    let binary = match boolean {
        // Os braços duma correspondência deve cobrir todos os valores possíveis
        false => 0,
        true => 1,
        // TODO ^ Tente comentar um destes braços
    };

    println!("{} -> {}", boolean, binary);
}

Desestruturação

Um bloco match pode desestruturar itens numa variedade de maneiras.

Tuplas

As tuplas podem ser desestruturas num match da seguinte maneira:

fn main() {
    let triple = (0, -2, 3);
    // TODO ^ Experimente valores diferentes para `triple`

    println!("Tell me about {:?}", triple);
    // A correspondência pode ser usada para desestruturar uma tupla
    match triple {
        // Desestruturar o segundo e o terceiro elemento
        (0, y, z) => println!("First is `0`, `y` is {:?}, and `z` is {:?}", y, z),
        (1, ..)  => println!("First is `1` and the rest doesn't matter"),
        (.., 2)  => println!("last is `2` and the rest doesn't matter"),
        (3, .., 4)  => println!("First is `3`, last is `4`, and the rest doesn't matter"),
        // `..` pode ser usado para ignorar o resto da tupla
        _      => println!("It doesn't matter what they are"),
        // `_` significa não vincular o valor à uma variável
    }
}

Consulte também:

Tuplas

Vetores ou Pedaços

Tal como as tuplas, os vetores e pedaços podem ser desestruturados desta maneira:

fn main() {
    // Experimente mudar os valores no vetor, ou torne-o num pedaço!
    let array = [1, -2, 6];

    match array {
        // Vincula o segundo e o terceiro elemento às respetivas variáveis
        [0, second, third] =>
            println!("array[0] = 0, array[1] = {}, array[2] = {}", second, third),

        // Os valores individuais podem ser ignorados com _
        [1, _, third] => println!(
            "array[0] = 1, array[2] = {} and array[1] was ignored",
            third
        ),

        // Nós também podemos vincular alguns e ignorar o resto
        [-1, second, ..] => println!(
            "array[0] = -1, array[1] = {} and all the other ones were ignored",
            second
        ),
        // O código abaixo não compilaria
        // [-1, second] => ...

        // Ou armazena-os noutro vetor ou pedaço (o tipo depende do
        // valor que está sendo comparado)
        [3, second, tail @ ..] => println!(
            "array[0] = 3, array[1] = {} and the other elements were {:?}",
            second, tail
        ),

        // Combinado estes padrões, podemos, por exemplo, vincular os
        // primeiros e últimos valores, e armazenar o resto deles num único vetor
        [first, middle @ .., last] => println!(
            "array[0] = {}, middle = {:?}, array[2] = {}",
            first, middle, last
        ),
    }
}

Consulte também:

Vetores e Pedaços e Vínculo pelo símbolo @

Enumerações

Uma enum é desestruturada de maneira semelhante:

// `allow` exigido para silenciar os avisos
// porque apenas uma variante é usada.
#[allow(dead_code)]
enum Color {
    // Estes 3 são especificados unicamente pelo seu nome.
    Red,
    Blue,
    Green,
    // Estes igualmente atam as tuplas de `u32` para
    // nomes diferentes: modelos de cor.
    RGB(u32, u32, u32),
    HSV(u32, u32, u32),
    HSL(u32, u32, u32),
    CMY(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

fn main() {
    let color = Color::RGB(122, 17, 40);
    // TODO ^ Experimente diferentes variantes para `color`

    println!("What color is it?");
    // Uma `enum` pode ser desestruturada usando uma `match`.
    match color {
        Color::Red   => println!("The color is Red!"),
        Color::Blue  => println!("The color is Blue!"),
        Color::Green => println!("The color is Green!"),
        Color::RGB(r, g, b) =>
            println!("Red: {}, green: {}, and blue: {}!", r, g, b),
        Color::HSV(h, s, v) =>
            println!("Hue: {}, saturation: {}, value: {}!", h, s, v),
        Color::HSL(h, s, l) =>
            println!("Hue: {}, saturation: {}, lightness: {}!", h, s, l),
        Color::CMY(c, m, y) =>
            println!("Cyan: {}, magenta: {}, yellow: {}!", c, m, y),
        Color::CMYK(c, m, y, k) =>
            println!("Cyan: {}, magenta: {}, yellow: {}, key (black): {}!",
                c, m, y, k),
        // Não precisa doutro braço porque todas variantes foram examinadas
    }
}

Consulte também:

#[allow(...)], modelos de cor e enum

Ponteiros ou Referência

Para os ponteiros, uma distinção precisa de ser feita entre a desestruturação e a destruição de referência, uma vez que são conceitos diferentes que são usados de maneira diferente nas linguagens como C/C++.

  • A destruição de referência usa *
  • A desestruturação usa &, ref, e ref mut
fn main() {
    // Atribuir uma referência do tipo `i32`. O `&` significa que
    // existe uma referência a ser atribuída.
    let reference = &4;

    match reference {
        // Se `reference` for o padrão correspondente contra `&val`,
        // resulta numa comparação como:
        // `&i32`
        // `&val`
        // ^ Nós vemos que se os `&` correspondentes forem eliminados,
        // então deve ser atribuído à `val`.
        &val => println!("Got a value via destructuring: {:?}", val),
    }

    // Para evitar o `&`, destruímos a referência antes da correspondência.
    match *reference {
        val => println!("Got a value via dereferencing: {:?}", val),
    }

    // E se não começarmos com uma referência? `reference` era um `&`
    // porque o lado direito já era uma referência. Isto não é uma
    // referência porque o lado direito não é uma.
    let _not_a_reference = 3;

    // A Rust fornece `ref` para exatamente este propósito. Ela modifica a
    // atribuição para que uma referência seja criada para o elemento;
    // esta referência é atribuída.
    let ref _is_a_reference = 3;

    // Assim, ao definir 2 valores sem referência, as referências podem
    // ser recuperadas através de `ref` e `ref mut`.
    let value = 5;
    let mut mut_value = 6;

    // Usar a palavra-chave `ref` para criar uma referência
    match value {
        ref r => println!("Got a reference to a value: {:?}", r),
    }

    // Usar `ref mut` de maneira semelhante.
    match mut_value {
        ref mut m => {
            // Temos uma referência. Temos que destruir a sua referência
            // antes que possamos adicionar qualquer coisa à ela.
            *m += 10;
            println!("We added 10. `mut_value`: {:?}", m);
        },
    }
}

Consulte também:

O padrão de referência

Estruturas

De maneira semelhante, uma struct pode ser desestruturada como mostrada:

fn main() {
    struct Foo {
        x: (u32, u32),
        y: u32,
    }

    // Experimente mudar os valores na estrutura para veres o que acontece
    let foo = Foo { x: (1, 2), y: 3 };

    match foo {
        Foo { x: (1, b), y } => println!("First of x is 1, b = {},  y = {} ", b, y),

        // nós podemos desestruturar as estruturas e renomear as variáveis,
        // a ordem não é importante
        Foo { y: 2, x: i } => println!("y is 2, i = {:?}", i),

        // e também podemos ignorar algumas variáveis:
        Foo { y, .. } => println!("y = {}, we don't care about x", y),
        // isto dará um erro: o padrão não menciona o campo `x`
        //Foo { y } => println!("y = {}", y),
    }

    let faa = Foo { x: (1, 2), y: 3 };

    // Nós não precisamos dum bloco de correspondência para
    // desestruturar as estruturas:
    let Foo { x : x0, y: y0 } = faa;
    println!("Outside: x0 = {x0:?}, y0 = {y0}");
}

Consulte também:

Estruturas

Guardas

Uma guarda de match pode ser adicionada para filtrar o braço.

#[allow(dead_code)]
enum Temperature {
    Celsius(i32),
    Fahrenheit(i32),
}

fn main() {
    let temperature = Temperature::Celsius(35);
    // ^ TODO experimento diferentes valores para `temperature`

    match temperature {
        Temperature::Celsius(t) if t > 30 => println!("{}C is above 30 Celsius", t),
        // A parte da condição if ^ é uma guarda
        Temperature::Celsius(t) => println!("{}C is below 30 Celsius", t),

        Temperature::Fahrenheit(t) if t > 86 => println!("{}F is above 86 Fahrenheit", t),
        Temperature::Fahrenheit(t) => println!("{}F is below 86 Fahrenheit", t),
    }
}

Nota que o compilador não levará as condições da guarda em consideração quando verificamos se todos os padrões estão cobertos pela expressão de correspondência.

fn main() {
    let number: u8 = 4;

    match number {
        i if i == 0 => println!("Zero"),
        i if i > 0 => println!("Greater than zero"),
        // _ => unreachable!("Should never happen."),
        // TODO ^ remova o comentário para corrigir a compilação
    }
}

Consulte Também:

Tuplas Enumerações

Vínculo

Acessar indiretamente uma variável torna impossível ramificar e usar esta variável sem re-vincular. match fornece o símbolo @ para vincular os valores aos nomes:

// Uma função `age` que retorna uma `u32`.
fn age() -> u32 {
    15
}

fn main() {
    println!("Tell me what type of person you are");

    match age() {
        0             => println!("I haven't celebrated my first birthday yet"),
        // Poderia `match` 1 ..= 12 diretamente mas então de qual
        // idade seria a criança? Ao invés disto, vinculamos ao `n`
        // para a sequência de 1 ..= 12. Agora a idade pode ser reportada.
        n @ 1  ..= 12 => println!("I'm a child of age {:?}", n),
        n @ 13 ..= 19 => println!("I'm a teen of age {:?}", n),
        // Nada vinculado. Retorna o resultado.
        n             => println!("I'm an old person of age {:?}", n),
    }
}

You can also use binding to "destructure" enum variants, such as Option: Nós também podemos usar o vínculo para "desestruturar" as variantes de enum, tais como Option:

fn some_number() -> Option<u32> {
    Some(42)
}

fn main() {
    match some_number() {
        // Recebeu a variante `Some`, corresponde se seu valor,
        // vinculado à `n`, é igual à 42.
        Some(n @ 42) => println!("The Answer: {}!", n),
        // Corresponde qualquer outro número.
        Some(n)      => println!("Not interesting... {}", n),
        // Corresponde mais alguma coisa (variante `None`).
        _            => (),
    }
}

Consulte Também:

functions, enums e Option

if let

Para alguns casos de uso, quando correspondemos enumerações, match é difícil. Por exemplo:

#![allow(unused)]
fn main() {
// Tornar `optional` do tipo `Option<i32>`
let optional = Some(7);

match optional {
    Some(i) => {
        println!("This is a really long string and `{:?}`", i);
        // ^ Precisava de 2 indentações só assim poderíamos 
        // desestruturar `i` a partir da opção.
    },
    _ => {},
    // ^ Obrigatório porque `match` é exaustiva.
    // Não parece como espaço desperdiçado?
};

}

if let é mais clara para este caso de uso e além disso permite várias opções de fracasso a serem especificadas:

fn main() {
    // Todos têm o tipo `Option<i32>`
    let number = Some(7);
    let letter: Option<i32> = None;
    let emoticon: Option<i32> = None;

    // A construção `if let` lê-se: "se `let` desestruturar `number` na"
    // `Some(i)`, avaliar o bloco (`{}`).
    if let Some(i) = number {
        println!("Matched {:?}!", i);
    }

    // Se precisarmos de especificar uma falha, usamos uma `else`:
    if let Some(i) = letter {
        println!("Matched {:?}!", i);
    } else {
        // Desestruturação falhada. Mudar para o caso de falha.
        println!("Didn't match a number. Let's go with a letter!");
    }

    // Fornecer uma condição de falha alterada.
    let i_like_letters = false;

    if let Some(i) = emoticon {
        println!("Matched {:?}!", i);
    // Desestruturação falhada. Avaliar uma condição `else if` para
    // ver se o ramo de falha alternada deveria ser considerado:
    } else if i_like_letters {
        println!("Didn't match a number. Let's go with a letter!");
    } else {
        // A condição avaliou `false`. Este ramo é o padrão:
        println!("I don't like letters. Let's go with an emoticon :)!");
    }
}

De algum modo, if let pode ser usada para corresponder apenas o valor de enumeração:

// Nossa enumeração de exemplo
enum Foo {
    Bar,
    Baz,
    Qux(u32)
}

fn main() {
    // Criar variáveis de exemplo
    let a = Foo::Bar;
    let b = Foo::Baz;
    let c = Foo::Qux(100);
    
    // Variável `a` corresponde `Foo::Bar`
    if let Foo::Bar = a {
        println!("a is foobar");
    }
    
    // Variável `b` não corresponde `Foo::Bar`
    // Então isto não imprimirá nada
    if let Foo::Bar = b {
        println!("b is foobar");
    }
    
    // Variável `c` corresponde `Foo::Qux` que tem um valor
    // Semelhante à `Some()` no exemplo anterior
    if let Foo::Qux(value) = c {
        println!("c is {}", value);
    }

    // Vínculos também funcionam com `if let`
    if let Foo::Qux(value @ 100) = c {
        println!("c is one hundred");
    }
}

Um outro beneficio é que if let permite-nos corresponder variantes de enumeração não parametrizadas. Isto é verdadeiro mesmo em casos onde a enumeração não implementa ou deriva PartialEq. Em tais casos if Foo::Bar == a falharia ao compilar, porque as instâncias da enumeração não podem ser equiparadas, no entanto if let continuará a funcionar.

Gostarias dum desafio? Corrija o seguinte exemplo para usar if let:

// Esta enumeração propositadamente não implemente nem deriva `PartialEq`.
// É por isto que a comparação abaixo `Foo::Bar == a` falha.
enum Foo {Bar}

fn main() {
    let a = Foo::Bar;

    // Variável `a` corresponde `Foo::Bar`
    if Foo::Bar == a {
    // ^-- isto causa um erro de tempo de compilação. Use `if let`.
        println!("a is foobar");
    }
}

Consulte também:

enum, Option, e o RFC

let-else

🛈 estável desde a: rust 1.65

Com let-else, um padrão refutável pode corresponder e vincular variáveis no âmbito envolvente como um let normal, se não diverge (por exemplo, break, return, panic!) quando o padrão não corresponder:

#![allow(unused)]
fn main() {
use std::str::FromStr;

fn get_count_item(s: &str) -> (u64, &str) {
    let mut it = s.split(' ');
    let (Some(count_str), Some(item)) = (it.next(), it.next()) else {
        panic!("Can't segment count item pair: '{s}'");
    };
    let Ok(count) = u64::from_str(count_str) else {
        panic!("Can't parse integer: '{count_str}'");
    };
    (count, item)
}

assert_eq!(get_count_item("3 chairs"), (3, "chairs"));
}

O âmbito de vínculos de nome é a coisa principal que torna isto diferente de match ou expressões if let-else. Nós poderíamos anteriormente aproximar estes padrões com um lamentável bit de repetição e um let externo:

#![allow(unused)]
fn main() {
use std::str::FromStr;

fn get_count_item(s: &str) -> (u64, &str) {
    let mut it = s.split(' ');
    let (count_str, item) = match (it.next(), it.next()) {
        (Some(count_str), Some(item)) => (count_str, item),
        _ => panic!("Can't segment count item pair: '{s}'"),
    };
    let count = if let Ok(count) = u64::from_str(count_str) {
        count
    } else {
        panic!("Can't parse integer: '{count_str}'");
    };
    (count, item)
}

assert_eq!(get_count_item("3 chairs"), (3, "chairs"));
}

Consulte também:

option, match, if let e o RFC de let-else.

while let

Semelhante ao if let, while let pode tornar sequências de match embaraçosas mais toleráveis. Considere a seguinte sequência que incrementa i:

#![allow(unused)]
fn main() {
// Tornar `optional` do tipo `Option<i32>`
let mut optional = Some(0);

// Tentar repetidamente este teste.
loop {
    match optional {
        // Se `optional` desestruturar, avalia o bloco.
        Some(i) => {
            if i > 9 {
                println!("Greater than 9, quit!");
                optional = None;
            } else {
                println!("`i` is `{:?}`. Try again.", i);
                optional = Some(i + 1);
            }
            // ^ Exige 3 indentações!
        },
        // Sair do laço de repetição quando a desestruturação falhar:
        _ => { break; }
        // ^ Porquê que isto deveria ser obrigatório? Deve existir um jeito melhor!
    }
}
}

O uso de while let torna esta sequência muito mais agradável:

fn main() {
    // Tornar `optional` do tipo `Option<i32>`
    let mut optional = Some(0);

    // Isto lê-se: "enquanto `let` desestruturar `optional`" à
    // `Some(i)`, avaliar o bloco (`{}`). Se não `break`.
    while let Some(i) = optional {
        if i > 9 {
            println!("Greater than 9, quit!");
            optional = None;
        } else {
            println!("`i` is `{:?}`. Try again.", i);
            optional = Some(i + 1);
        }
        // ^ Menos desvio à direita e não exige
        // manipular explicitamente o caso de falha.
    }
    // ^ `if let` tinha cláusulas `else` ou `else if` opcionais adicionais
    // `while let` não tem estas.
}

Consulte também:

enum, Option, e o RFC

Funções

As funções são declaras usando a palavra-chave fn. Os seus argumentos têm os tipos anotados, tal como as variáveis, e se a função retornar um valor, o tipo de retorno de ser especificado depois duma flecha (ou seta) ->.

A expressão final numa função será usada como valor de retorno. Alternativamente, a declaração de return pode ser usada para retornar um valor mais cedo a partir de dentro da função, até mesmo a partir de dentro dos laços de repetição ou declarações de if.

Vamos reescrever o FizzBuzz usando funções!:

// Diferente de C e C++,
// não existe restrição sobre a ordem de definições de função
fn main() {
    // Nós podemos usar esta função,
    // e defini-la em algum lugar depois
    fizzbuzz_to(100);
}

// Função que retorna um valor booleano
fn is_divisible_by(lhs: u32, rhs: u32) -> bool {
    // Caso extremo, retornar prematuramente
    if rhs == 0 {
        return false;
    }

    // Isto é uma expressão,
    // a palavra-chave `return` não é necessária
    lhs % rhs == 0
}

// As funções que "não" retornam um valor,
// na realidade retornam o tipo unitário `()`
fn fizzbuzz(n: u32) -> () {
    if is_divisible_by(n, 15) {
        println!("fizzbuzz");
    } else if is_divisible_by(n, 3) {
        println!("fizz");
    } else if is_divisible_by(n, 5) {
        println!("buzz");
    } else {
        println!("{}", n);
    }
}

// Quando uma função retornar `()`,
// o tipo de retorno pode ser omitido da assinatura
fn fizzbuzz_to(n: u32) {
    for n in 1..=n {
        fizzbuzz(n);
    }
}

Funções e Métodos Associados

Algumas funções são conectadas à um topo específico. Existem em duas formas: funções associadas, e métodos. As funções associadas são funções que são definidas geralmente sobre um tipo, enquanto os métodos são funções associadas que são chamados sobre uma instância específica dum tipo:

struct Point {
    x: f64,
    y: f64,
}

// Bloco de implementação, todas funções associadas do `Point`
// e os métodos são colocados neste bloco
impl Point {
    // Isto é uma "função associada" pois esta função está associada à
    // um tipo específico, que é, `Point`.
    //
    // Funções associadas não precisam ser chamadas com uma instância.
    // Estas funções são geralmente usadas como construtoras.
    fn origin() -> Point {
        Point { x: 0.0, y: 0.0 }
    }

    // Uma outra função associada, recebendo dois argumentos:
    fn new(x: f64, y: f64) -> Point {
        Point { x: x, y: y }
    }
}

struct Rectangle {
    p1: Point,
    p2: Point,
}

impl Rectangle {
    // Isto é um método
    // `&self` é açúcar sintático para `self: &Self`,
    // aonde `Self` é o tipo do objeto chamador.
    // Neste caso `Self` = `Rectangle`
    fn area(&self) -> f64 {
        // `self` dá acesso aos campos da estrutura através do
        // operador ponto
        let Point { x: x1, y: y1 } = self.p1;
        let Point { x: x2, y: y2 } = self.p2;

        // `abs` é um método de `f64` que retorna o
        // valor absoluto do chamador
        ((x1 - x2) * (y1 - y2)).abs()
    }

    fn perimeter(&self) -> f64 {
        let Point { x: x1, y: y1 } = self.p1;
        let Point { x: x2, y: y2 } = self.p2;

        2.0 * ((x1 - x2).abs() + (y1 - y2).abs())
    }

    // Este método exige que objeto chamador seja mutável.
    // `&mut self` é açúcar para `self: &mut Self`
    fn translate(&mut self, x: f64, y: f64) {
        self.p1.x += x;
        self.p2.x += x;

        self.p1.y += y;
        self.p2.y += y;
    }
}

// `Pair` possui os recursos: dois amontoados alocados de inteiros
struct Pair(Box<i32>, Box<i32>);

impl Pair {
    // Este método "consome" os recursos do objeto chamador
    // `self` é açúcar para `self: Self`
    fn destroy(self) {
        // Desestruturar `self`
        let Pair(first, second) = self;

        println!("Destroying Pair({}, {})", first, second);

        // `first` e `second` saem do âmbito e são liberados
    }
}

fn main() {
    let rectangle = Rectangle {
        // As funções associadas são chamadas usando dois-pontos duplos
        p1: Point::origin(),
        p2: Point::new(3.0, 4.0),
    };

    // Os métodos são chamados usando o operador ponto
    // Nota que o primeiro argumento `&self` é implicitamente passado, isto é,
    // `rectangle.perimeter()` === `Rectangle::perimeter(&rectangle)`
    println!("Rectangle perimeter: {}", rectangle.perimeter());
    println!("Rectangle area: {}", rectangle.area());

    let mut square = Rectangle {
        p1: Point::origin(),
        p2: Point::new(1.0, 1.0),
    };

    // Error! `rectangle` é imutável, mas este método exige um objeto mutável
    //rectangle.translate(1.0, 0.0);
    // TODO ^ Tente desfazer o comentário desta linha

    // Okay! Os objetos mutáveis podem chamar métodos mutáveis
    square.translate(1.0, 1.0);

    let pair = Pair(Box::new(1), Box::new(2));

    pair.destroy();

    // Error! A chama de `destroy` anterior "consumiu" `pair`
    //pair.destroy();
    // TODO ^ Tente desfazer o comentário desta linha
}

Closures

Closures are functions that can capture the enclosing environment. For example, a closure that captures the x variable:

|val| val + x

The syntax and capabilities of closures make them very convenient for on the fly usage. Calling a closure is exactly like calling a function. However, both input and return types can be inferred and input variable names must be specified.

Other characteristics of closures include:

  • using || instead of () around input variables.
  • optional body delimination ({}) for a single expression (mandatory otherwise).
  • the ability to capture the outer environment variables.
fn main() {
    let outer_var = 42;
    
    // A regular function can't refer to variables in the enclosing environment
    //fn function(i: i32) -> i32 { i + outer_var }
    // TODO: uncomment the line above and see the compiler error. The compiler
    // suggests that we define a closure instead.

    // Closures are anonymous, here we are binding them to references
    // Annotation is identical to function annotation but is optional
    // as are the `{}` wrapping the body. These nameless functions
    // are assigned to appropriately named variables.
    let closure_annotated = |i: i32| -> i32 { i + outer_var };
    let closure_inferred  = |i     |          i + outer_var  ;

    // Call the closures.
    println!("closure_annotated: {}", closure_annotated(1));
    println!("closure_inferred: {}", closure_inferred(1));
    // Once closure's type has been inferred, it cannot be inferred again with another type.
    //println!("cannot reuse closure_inferred with another type: {}", closure_inferred(42i64));
    // TODO: uncomment the line above and see the compiler error.

    // A closure taking no arguments which returns an `i32`.
    // The return type is inferred.
    let one = || 1;
    println!("closure returning one: {}", one());

}

Capturing

Closures are inherently flexible and will do what the functionality requires to make the closure work without annotation. This allows capturing to flexibly adapt to the use case, sometimes moving and sometimes borrowing. Closures can capture variables:

  • by reference: &T
  • by mutable reference: &mut T
  • by value: T

They preferentially capture variables by reference and only go lower when required.

fn main() {
    use std::mem;
    
    let color = String::from("green");

    // A closure to print `color` which immediately borrows (`&`) `color` and
    // stores the borrow and closure in the `print` variable. It will remain
    // borrowed until `print` is used the last time. 
    //
    // `println!` only requires arguments by immutable reference so it doesn't
    // impose anything more restrictive.
    let print = || println!("`color`: {}", color);

    // Call the closure using the borrow.
    print();

    // `color` can be borrowed immutably again, because the closure only holds
    // an immutable reference to `color`. 
    let _reborrow = &color;
    print();

    // A move or reborrow is allowed after the final use of `print`
    let _color_moved = color;


    let mut count = 0;
    // A closure to increment `count` could take either `&mut count` or `count`
    // but `&mut count` is less restrictive so it takes that. Immediately
    // borrows `count`.
    //
    // A `mut` is required on `inc` because a `&mut` is stored inside. Thus,
    // calling the closure mutates the closure which requires a `mut`.
    let mut inc = || {
        count += 1;
        println!("`count`: {}", count);
    };

    // Call the closure using a mutable borrow.
    inc();

    // The closure still mutably borrows `count` because it is called later.
    // An attempt to reborrow will lead to an error.
    // let _reborrow = &count; 
    // ^ TODO: try uncommenting this line.
    inc();

    // The closure no longer needs to borrow `&mut count`. Therefore, it is
    // possible to reborrow without an error
    let _count_reborrowed = &mut count; 

    
    // A non-copy type.
    let movable = Box::new(3);

    // `mem::drop` requires `T` so this must take by value. A copy type
    // would copy into the closure leaving the original untouched.
    // A non-copy must move and so `movable` immediately moves into
    // the closure.
    let consume = || {
        println!("`movable`: {:?}", movable);
        mem::drop(movable);
    };

    // `consume` consumes the variable so this can only be called once.
    consume();
    // consume();
    // ^ TODO: Try uncommenting this line.
}

Using move before vertical pipes forces closure to take ownership of captured variables:

fn main() {
    // `Vec` has non-copy semantics.
    let haystack = vec![1, 2, 3];

    let contains = move |needle| haystack.contains(needle);

    println!("{}", contains(&1));
    println!("{}", contains(&4));

    // println!("There're {} elements in vec", haystack.len());
    // ^ Uncommenting above line will result in compile-time error
    // because borrow checker doesn't allow re-using variable after it
    // has been moved.
    
    // Removing `move` from closure's signature will cause closure
    // to borrow _haystack_ variable immutably, hence _haystack_ is still
    // available and uncommenting above line will not cause an error.
}

See also:

Box and std::mem::drop

As input parameters

While Rust chooses how to capture variables on the fly mostly without type annotation, this ambiguity is not allowed when writing functions. When taking a closure as an input parameter, the closure's complete type must be annotated using one of a few traits, and they're determined by what the closure does with captured value. In order of decreasing restriction, they are:

  • Fn: the closure uses the captured value by reference (&T)
  • FnMut: the closure uses the captured value by mutable reference (&mut T)
  • FnOnce: the closure uses the captured value by value (T)

On a variable-by-variable basis, the compiler will capture variables in the least restrictive manner possible.

For instance, consider a parameter annotated as FnOnce. This specifies that the closure may capture by &T, &mut T, or T, but the compiler will ultimately choose based on how the captured variables are used in the closure.

This is because if a move is possible, then any type of borrow should also be possible. Note that the reverse is not true. If the parameter is annotated as Fn, then capturing variables by &mut T or T are not allowed. However, &T is allowed.

In the following example, try swapping the usage of Fn, FnMut, and FnOnce to see what happens:

// A function which takes a closure as an argument and calls it.
// <F> denotes that F is a "Generic type parameter"
fn apply<F>(f: F) where
    // The closure takes no input and returns nothing.
    F: FnOnce() {
    // ^ TODO: Try changing this to `Fn` or `FnMut`.

    f();
}

// A function which takes a closure and returns an `i32`.
fn apply_to_3<F>(f: F) -> i32 where
    // The closure takes an `i32` and returns an `i32`.
    F: Fn(i32) -> i32 {

    f(3)
}

fn main() {
    use std::mem;

    let greeting = "hello";
    // A non-copy type.
    // `to_owned` creates owned data from borrowed one
    let mut farewell = "goodbye".to_owned();

    // Capture 2 variables: `greeting` by reference and
    // `farewell` by value.
    let diary = || {
        // `greeting` is by reference: requires `Fn`.
        println!("I said {}.", greeting);

        // Mutation forces `farewell` to be captured by
        // mutable reference. Now requires `FnMut`.
        farewell.push_str("!!!");
        println!("Then I screamed {}.", farewell);
        println!("Now I can sleep. zzzzz");

        // Manually calling drop forces `farewell` to
        // be captured by value. Now requires `FnOnce`.
        mem::drop(farewell);
    };

    // Call the function which applies the closure.
    apply(diary);

    // `double` satisfies `apply_to_3`'s trait bound
    let double = |x| 2 * x;

    println!("3 doubled: {}", apply_to_3(double));
}

See also:

std::mem::drop, Fn, FnMut, Generics, where and FnOnce

Type anonymity

Closures succinctly capture variables from enclosing scopes. Does this have any consequences? It surely does. Observe how using a closure as a function parameter requires generics, which is necessary because of how they are defined:

#![allow(unused)]
fn main() {
// `F` must be generic.
fn apply<F>(f: F) where
    F: FnOnce() {
    f();
}
}

When a closure is defined, the compiler implicitly creates a new anonymous structure to store the captured variables inside, meanwhile implementing the functionality via one of the traits: Fn, FnMut, or FnOnce for this unknown type. This type is assigned to the variable which is stored until calling.

Since this new type is of unknown type, any usage in a function will require generics. However, an unbounded type parameter <T> would still be ambiguous and not be allowed. Thus, bounding by one of the traits: Fn, FnMut, or FnOnce (which it implements) is sufficient to specify its type.

// `F` must implement `Fn` for a closure which takes no
// inputs and returns nothing - exactly what is required
// for `print`.
fn apply<F>(f: F) where
    F: Fn() {
    f();
}

fn main() {
    let x = 7;

    // Capture `x` into an anonymous type and implement
    // `Fn` for it. Store it in `print`.
    let print = || println!("{}", x);

    apply(print);
}

See also:

A thorough analysis, Fn, FnMut, and FnOnce

Input functions

Since closures may be used as arguments, you might wonder if the same can be said about functions. And indeed they can! If you declare a function that takes a closure as parameter, then any function that satisfies the trait bound of that closure can be passed as a parameter.

// Define a function which takes a generic `F` argument
// bounded by `Fn`, and calls it
fn call_me<F: Fn()>(f: F) {
    f();
}

// Define a wrapper function satisfying the `Fn` bound
fn function() {
    println!("I'm a function!");
}

fn main() {
    // Define a closure satisfying the `Fn` bound
    let closure = || println!("I'm a closure!");

    call_me(closure);
    call_me(function);
}

As an additional note, the Fn, FnMut, and FnOnce traits dictate how a closure captures variables from the enclosing scope.

See also:

Fn, FnMut, and FnOnce

As output parameters

Closures as input parameters are possible, so returning closures as output parameters should also be possible. However, anonymous closure types are, by definition, unknown, so we have to use impl Trait to return them.

The valid traits for returning a closure are:

  • Fn
  • FnMut
  • FnOnce

Beyond this, the move keyword must be used, which signals that all captures occur by value. This is required because any captures by reference would be dropped as soon as the function exited, leaving invalid references in the closure.

fn create_fn() -> impl Fn() {
    let text = "Fn".to_owned();

    move || println!("This is a: {}", text)
}

fn create_fnmut() -> impl FnMut() {
    let text = "FnMut".to_owned();

    move || println!("This is a: {}", text)
}

fn create_fnonce() -> impl FnOnce() {
    let text = "FnOnce".to_owned();

    move || println!("This is a: {}", text)
}

fn main() {
    let fn_plain = create_fn();
    let mut fn_mut = create_fnmut();
    let fn_once = create_fnonce();

    fn_plain();
    fn_mut();
    fn_once();
}

See also:

Fn, FnMut, Generics and impl Trait.

Examples in std

This section contains a few examples of using closures from the std library.

Iterator::any

Iterator::any is a function which when passed an iterator, will return true if any element satisfies the predicate. Otherwise false. Its signature:

pub trait Iterator {
    // The type being iterated over.
    type Item;

    // `any` takes `&mut self` meaning the caller may be borrowed
    // and modified, but not consumed.
    fn any<F>(&mut self, f: F) -> bool where
        // `FnMut` meaning any captured variable may at most be
        // modified, not consumed. `Self::Item` states it takes
        // arguments to the closure by value.
        F: FnMut(Self::Item) -> bool;
}
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`. Destructure to `i32`.
    println!("2 in vec1: {}", vec1.iter()     .any(|&x| x == 2));
    // `into_iter()` for vecs yields `i32`. No destructuring required.
    println!("2 in vec2: {}", vec2.into_iter().any(|x| x == 2));

    // `iter()` only borrows `vec1` and its elements, so they can be used again
    println!("vec1 len: {}", vec1.len());
    println!("First element of vec1 is: {}", vec1[0]);
    // `into_iter()` does move `vec2` and its elements, so they cannot be used again
    // println!("First element of vec2 is: {}", vec2[0]);
    // println!("vec2 len: {}", vec2.len());
    // TODO: uncomment two lines above and see compiler errors.

    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`.
    println!("2 in array1: {}", array1.iter()     .any(|&x| x == 2));
    // `into_iter()` for arrays yields `i32`.
    println!("2 in array2: {}", array2.into_iter().any(|x| x == 2));
}

See also:

std::iter::Iterator::any

Searching through iterators

Iterator::find is a function which iterates over an iterator and searches for the first value which satisfies some condition. If none of the values satisfy the condition, it returns None. Its signature:

pub trait Iterator {
    // The type being iterated over.
    type Item;

    // `find` takes `&mut self` meaning the caller may be borrowed
    // and modified, but not consumed.
    fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where
        // `FnMut` meaning any captured variable may at most be
        // modified, not consumed. `&Self::Item` states it takes
        // arguments to the closure by reference.
        P: FnMut(&Self::Item) -> bool;
}
fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];

    // `iter()` for vecs yields `&i32`.
    let mut iter = vec1.iter();
    // `into_iter()` for vecs yields `i32`.
    let mut into_iter = vec2.into_iter();

    // `iter()` for vecs yields `&i32`, and we want to reference one of its
    // items, so we have to destructure `&&i32` to `i32`
    println!("Find 2 in vec1: {:?}", iter     .find(|&&x| x == 2));
    // `into_iter()` for vecs yields `i32`, and we want to reference one of
    // its items, so we have to destructure `&i32` to `i32`
    println!("Find 2 in vec2: {:?}", into_iter.find(| &x| x == 2));

    let array1 = [1, 2, 3];
    let array2 = [4, 5, 6];

    // `iter()` for arrays yields `&i32`
    println!("Find 2 in array1: {:?}", array1.iter()     .find(|&&x| x == 2));
    // `into_iter()` for arrays yields `i32`
    println!("Find 2 in array2: {:?}", array2.into_iter().find(|&x| x == 2));
}

Iterator::find gives you a reference to the item. But if you want the index of the item, use Iterator::position.

fn main() {
    let vec = vec![1, 9, 3, 3, 13, 2];

    // `iter()` for vecs yields `&i32` and `position()` does not take a reference, so
    // we have to destructure `&i32` to `i32`
    let index_of_first_even_number = vec.iter().position(|&x| x % 2 == 0);
    assert_eq!(index_of_first_even_number, Some(5));
    
    // `into_iter()` for vecs yields `i32` and `position()` does not take a reference, so
    // we do not have to destructure    
    let index_of_first_negative_number = vec.into_iter().position(|x| x < 0);
    assert_eq!(index_of_first_negative_number, None);
}

See also:

std::iter::Iterator::find

std::iter::Iterator::find_map

std::iter::Iterator::position

std::iter::Iterator::rposition

Funções de Ordem Superior

A Rust fornece Funções de Ordem Superior (HOF, sigla em Inglês). Estas são funções que recebem um ou mais funções e ou produzem uma função mais útil. As funções de ordem superior e iteradores preguiçosos dão à Rust o seu sabor funcional:

fn is_odd(n: u32) -> bool {
    n % 2 == 1
}

fn main() {
    println!("Find the sum of all the squared odd numbers under 1000");
    let upper = 1000;

    // Abordagem imperativa
    // Declarar variável acumuladora
    let mut acc = 0;
    // Iterar: 0, 1, 2, ... à infinito
    for n in 0.. {
        // Fazer a quadratura do número
        let n_squared = n * n;

        if n_squared >= upper {
            // Quebrar o laço se excedido o limite superior
            break;
        } else if is_odd(n_squared) {
            // Acumular o valor, se for ímpar
            acc += n_squared;
        }
    }
    println!("imperative style: {}", acc);

    // Abordagem funcional
    let sum_of_squared_odd_numbers: u32 =
        (0..).map(|n| n * n)                             // Todos os números naturais ao quadrado
             .take_while(|&n_squared| n_squared < upper) // Abaixo do limite superior
             .filter(|&n_squared| is_odd(n_squared))     // Que são ímpares
             .sum();                                     // Some-os
    println!("functional style: {}", sum_of_squared_odd_numbers);
}

Option e Iterator implementam a sua parte clara de funções de ordem superior.

Funções Divergentes

As funções divergentes nunca retornam. Elas são marcadas usando !, o qual é um tipo vazio.

#![allow(unused)]
fn main() {
fn foo() -> ! {
    panic!("This call never returns.");
}
}

Em oposição à todos os outros tipos, esta não pode ser instanciada, porque o conjunto de todos os possíveis valores que este tipo pode ter é vazio. Nota que, é diferente do tipo (), que tem exatamente um valor possível.

Por exemplo, esta função retorna de costume, embora não exista informação no valor de retorno:

fn some_fn() {
    ()
}

fn main() {
    let _a: () = some_fn();
    println!("This function returns and you can see this line.");
}

Em posição à esta função, que nunca retornará o controlo de volta ao chamador.

#![feature(never_type)]

fn main() {
    let x: ! = panic!("This call never returns.");
    println!("You will never see this line!");
}

Embora isto possa parecer como um conceito abstrato, é de fato muito útil e muitas vezes prático. A principal vantagem deste tipo é que pode ser fundido à qualquer outro e, portanto usado em lugares onde um tipo exato é obrigatório, por exemplo nos ramos de match. Isto permite-nos escrever código como este:

fn main() {
    fn sum_odd_numbers(up_to: u32) -> u32 {
        let mut acc = 0;
        for i in 0..up_to {
            // Repara que o tipo de retorno desta expressão de correspondência
            // de ser u32 por causa do tipo da variável "addition".
            let addition: u32 = match i%2 == 1 {
                // A variável "i" é do tipo u32, que está perfeitamente correta.
                true => i,
                // Por outro lado, a expressão "continue" não retorna u32,
                // mas ainda está bem, porque nunca retorna e portanto não viola os
                // requisitos da expressão de correspondência.
                false => continue,
            };
            acc += addition;
        }
        acc
    }
    println!("Sum of odd numbers up to 9 (excluding): {}", sum_odd_numbers(9));
}

Ela também é o tipo de retorno das função que percorrem os laços de repetição para sempre (por exemplo, loop {}) tal como servidores de rede ou funções que terminam o processo (por exemplo, exit()).

Módulos

A Rust fornece um sistema de módulo poderoso que pode ser usado para separar o código hierarquicamente em unidades lógicas (módulos), e gerir a visibilidade (público ou privado) entre eles.

Um módulo é uma coleção de itens: funções, estruturas, características, blocos de impl, e até mesmo outros módulos.

Visibility

By default, the items in a module have private visibility, but this can be overridden with the pub modifier. Only the public items of a module can be accessed from outside the module scope.

// A module named `my_mod`
mod my_mod {
    // Items in modules default to private visibility.
    fn private_function() {
        println!("called `my_mod::private_function()`");
    }

    // Use the `pub` modifier to override default visibility.
    pub fn function() {
        println!("called `my_mod::function()`");
    }

    // Items can access other items in the same module,
    // even when private.
    pub fn indirect_access() {
        print!("called `my_mod::indirect_access()`, that\n> ");
        private_function();
    }

    // Modules can also be nested
    pub mod nested {
        pub fn function() {
            println!("called `my_mod::nested::function()`");
        }

        #[allow(dead_code)]
        fn private_function() {
            println!("called `my_mod::nested::private_function()`");
        }

        // Functions declared using `pub(in path)` syntax are only visible
        // within the given path. `path` must be a parent or ancestor module
        pub(in crate::my_mod) fn public_function_in_my_mod() {
            print!("called `my_mod::nested::public_function_in_my_mod()`, that\n> ");
            public_function_in_nested();
        }

        // Functions declared using `pub(self)` syntax are only visible within
        // the current module, which is the same as leaving them private
        pub(self) fn public_function_in_nested() {
            println!("called `my_mod::nested::public_function_in_nested()`");
        }

        // Functions declared using `pub(super)` syntax are only visible within
        // the parent module
        pub(super) fn public_function_in_super_mod() {
            println!("called `my_mod::nested::public_function_in_super_mod()`");
        }
    }

    pub fn call_public_function_in_my_mod() {
        print!("called `my_mod::call_public_function_in_my_mod()`, that\n> ");
        nested::public_function_in_my_mod();
        print!("> ");
        nested::public_function_in_super_mod();
    }

    // pub(crate) makes functions visible only within the current crate
    pub(crate) fn public_function_in_crate() {
        println!("called `my_mod::public_function_in_crate()`");
    }

    // Nested modules follow the same rules for visibility
    mod private_nested {
        #[allow(dead_code)]
        pub fn function() {
            println!("called `my_mod::private_nested::function()`");
        }

        // Private parent items will still restrict the visibility of a child item,
        // even if it is declared as visible within a bigger scope.
        #[allow(dead_code)]
        pub(crate) fn restricted_function() {
            println!("called `my_mod::private_nested::restricted_function()`");
        }
    }
}

fn function() {
    println!("called `function()`");
}

fn main() {
    // Modules allow disambiguation between items that have the same name.
    function();
    my_mod::function();

    // Public items, including those inside nested modules, can be
    // accessed from outside the parent module.
    my_mod::indirect_access();
    my_mod::nested::function();
    my_mod::call_public_function_in_my_mod();

    // pub(crate) items can be called from anywhere in the same crate
    my_mod::public_function_in_crate();

    // pub(in path) items can only be called from within the module specified
    // Error! function `public_function_in_my_mod` is private
    //my_mod::nested::public_function_in_my_mod();
    // TODO ^ Try uncommenting this line

    // Private items of a module cannot be directly accessed, even if
    // nested in a public module:

    // Error! `private_function` is private
    //my_mod::private_function();
    // TODO ^ Try uncommenting this line

    // Error! `private_function` is private
    //my_mod::nested::private_function();
    // TODO ^ Try uncommenting this line

    // Error! `private_nested` is a private module
    //my_mod::private_nested::function();
    // TODO ^ Try uncommenting this line

    // Error! `private_nested` is a private module
    //my_mod::private_nested::restricted_function();
    // TODO ^ Try uncommenting this line
}

Struct visibility

Structs have an extra level of visibility with their fields. The visibility defaults to private, and can be overridden with the pub modifier. This visibility only matters when a struct is accessed from outside the module where it is defined, and has the goal of hiding information (encapsulation).

mod my {
    // A public struct with a public field of generic type `T`
    pub struct OpenBox<T> {
        pub contents: T,
    }

    // A public struct with a private field of generic type `T`
    pub struct ClosedBox<T> {
        contents: T,
    }

    impl<T> ClosedBox<T> {
        // A public constructor method
        pub fn new(contents: T) -> ClosedBox<T> {
            ClosedBox {
                contents: contents,
            }
        }
    }
}

fn main() {
    // Public structs with public fields can be constructed as usual
    let open_box = my::OpenBox { contents: "public information" };

    // and their fields can be normally accessed.
    println!("The open box contains: {}", open_box.contents);

    // Public structs with private fields cannot be constructed using field names.
    // Error! `ClosedBox` has private fields
    //let closed_box = my::ClosedBox { contents: "classified information" };
    // TODO ^ Try uncommenting this line

    // However, structs with private fields can be created using
    // public constructors
    let _closed_box = my::ClosedBox::new("classified information");

    // and the private fields of a public struct cannot be accessed.
    // Error! The `contents` field is private
    //println!("The closed box contains: {}", _closed_box.contents);
    // TODO ^ Try uncommenting this line
}

See also:

generics and methods

The use declaration

The use declaration can be used to bind a full path to a new name, for easier access. It is often used like this:

use crate::deeply::nested::{
    my_first_function,
    my_second_function,
    AndATraitType
};

fn main() {
    my_first_function();
}

You can use the as keyword to bind imports to a different name:

// Bind the `deeply::nested::function` path to `other_function`.
use deeply::nested::function as other_function;

fn function() {
    println!("called `function()`");
}

mod deeply {
    pub mod nested {
        pub fn function() {
            println!("called `deeply::nested::function()`");
        }
    }
}

fn main() {
    // Easier access to `deeply::nested::function`
    other_function();

    println!("Entering block");
    {
        // This is equivalent to `use deeply::nested::function as function`.
        // This `function()` will shadow the outer one.
        use crate::deeply::nested::function;

        // `use` bindings have a local scope. In this case, the
        // shadowing of `function()` is only in this block.
        function();

        println!("Leaving block");
    }

    function();
}

super and self

The super and self keywords can be used in the path to remove ambiguity when accessing items and to prevent unnecessary hardcoding of paths.

fn function() {
    println!("called `function()`");
}

mod cool {
    pub fn function() {
        println!("called `cool::function()`");
    }
}

mod my {
    fn function() {
        println!("called `my::function()`");
    }
    
    mod cool {
        pub fn function() {
            println!("called `my::cool::function()`");
        }
    }
    
    pub fn indirect_call() {
        // Let's access all the functions named `function` from this scope!
        print!("called `my::indirect_call()`, that\n> ");
        
        // The `self` keyword refers to the current module scope - in this case `my`.
        // Calling `self::function()` and calling `function()` directly both give
        // the same result, because they refer to the same function.
        self::function();
        function();
        
        // We can also use `self` to access another module inside `my`:
        self::cool::function();
        
        // The `super` keyword refers to the parent scope (outside the `my` module).
        super::function();
        
        // This will bind to the `cool::function` in the *crate* scope.
        // In this case the crate scope is the outermost scope.
        {
            use crate::cool::function as root_function;
            root_function();
        }
    }
}

fn main() {
    my::indirect_call();
}

File hierarchy

Modules can be mapped to a file/directory hierarchy. Let's break down the visibility example in files:

$ tree .
.
├── my
│   ├── inaccessible.rs
│   └── nested.rs
├── my.rs
└── split.rs

In split.rs:

// This declaration will look for a file named `my.rs` and will
// insert its contents inside a module named `my` under this scope
mod my;

fn function() {
    println!("called `function()`");
}

fn main() {
    my::function();

    function();

    my::indirect_access();

    my::nested::function();
}

In my.rs:

// Similarly `mod inaccessible` and `mod nested` will locate the `nested.rs`
// and `inaccessible.rs` files and insert them here under their respective
// modules
mod inaccessible;
pub mod nested;

pub fn function() {
    println!("called `my::function()`");
}

fn private_function() {
    println!("called `my::private_function()`");
}

pub fn indirect_access() {
    print!("called `my::indirect_access()`, that\n> ");

    private_function();
}

In my/nested.rs:

pub fn function() {
    println!("called `my::nested::function()`");
}

#[allow(dead_code)]
fn private_function() {
    println!("called `my::nested::private_function()`");
}

In my/inaccessible.rs:

#[allow(dead_code)]
pub fn public_function() {
    println!("called `my::inaccessible::public_function()`");
}

Let's check that things still work as before:

$ rustc split.rs && ./split
called `my::function()`
called `function()`
called `my::indirect_access()`, that
> called `my::private_function()`
called `my::nested::function()`

Crates

A crate is a compilation unit in Rust. Whenever rustc some_file.rs is called, some_file.rs is treated as the crate file. If some_file.rs has mod declarations in it, then the contents of the module files would be inserted in places where mod declarations in the crate file are found, before running the compiler over it. In other words, modules do not get compiled individually, only crates get compiled.

A crate can be compiled into a binary or into a library. By default, rustc will produce a binary from a crate. This behavior can be overridden by passing the --crate-type flag to lib.

Creating a Library

Let's create a library, and then see how to link it to another crate.

In rary.rs:

pub fn public_function() {
    println!("called rary's `public_function()`");
}

fn private_function() {
    println!("called rary's `private_function()`");
}

pub fn indirect_access() {
    print!("called rary's `indirect_access()`, that\n> ");

    private_function();
}
$ rustc --crate-type=lib rary.rs
$ ls lib*
library.rlib

Libraries get prefixed with "lib", and by default they get named after their crate file, but this default name can be overridden by passing the --crate-name option to rustc or by using the crate_name attribute.

Using a Library

To link a crate to this new library you may use rustc's --extern flag. All of its items will then be imported under a module named the same as the library. This module generally behaves the same way as any other module.

// extern crate rary; // May be required for Rust 2015 edition or earlier

fn main() {
    rary::public_function();

    // Error! `private_function` is private
    //rary::private_function();

    rary::indirect_access();
}
# Where library.rlib is the path to the compiled library, assumed that it's
# in the same directory here:
$ rustc executable.rs --extern rary=library.rlib && ./executable 
called rary's `public_function()`
called rary's `indirect_access()`, that
> called rary's `private_function()`

Cargo

cargo is the official Rust package management tool. It has lots of really useful features to improve code quality and developer velocity! These include:

  • Dependency management and integration with crates.io (the official Rust package registry)
  • Awareness of unit tests
  • Awareness of benchmarks

This chapter will go through some quick basics, but you can find the comprehensive docs in The Cargo Book.

Dependencies

Most programs have dependencies on some libraries. If you have ever managed dependencies by hand, you know how much of a pain this can be. Luckily, the Rust ecosystem comes standard with cargo! cargo can manage dependencies for a project.

To create a new Rust project,

# A binary
cargo new foo

# A library
cargo new --lib bar

For the rest of this chapter, let's assume we are making a binary, rather than a library, but all of the concepts are the same.

After the above commands, you should see a file hierarchy like this:

.
├── bar
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
└── foo
    ├── Cargo.toml
    └── src
        └── main.rs

The main.rs is the root source file for your new foo project -- nothing new there. The Cargo.toml is the config file for cargo for this project. If you look inside it, you should see something like this:

[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]

[dependencies]

The name field under [package] determines the name of the project. This is used by crates.io if you publish the crate (more later). It is also the name of the output binary when you compile.

The version field is a crate version number using Semantic Versioning.

The authors field is a list of authors used when publishing the crate.

The [dependencies] section lets you add dependencies for your project.

For example, suppose that we want our program to have a great CLI. You can find lots of great packages on crates.io (the official Rust package registry). One popular choice is clap. As of this writing, the most recent published version of clap is 2.27.1. To add a dependency to our program, we can simply add the following to our Cargo.toml under [dependencies]: clap = "2.27.1". And that's it! You can start using clap in your program.

cargo also supports other types of dependencies. Here is just a small sampling:

[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]

[dependencies]
clap = "2.27.1" # from crates.io
rand = { git = "https://github.com/rust-lang-nursery/rand" } # from online repo
bar = { path = "../bar" } # from a path in the local filesystem

cargo is more than a dependency manager. All of the available configuration options are listed in the format specification of Cargo.toml.

To build our project we can execute cargo build anywhere in the project directory (including subdirectories!). We can also do cargo run to build and run. Notice that these commands will resolve all dependencies, download crates if needed, and build everything, including your crate. (Note that it only rebuilds what it has not already built, similar to make).

Voila! That's all there is to it!

Conventions

In the previous chapter, we saw the following directory hierarchy:

foo
├── Cargo.toml
└── src
    └── main.rs

Suppose that we wanted to have two binaries in the same project, though. What then?

It turns out that cargo supports this. The default binary name is main, as we saw before, but you can add additional binaries by placing them in a bin/ directory:

foo
├── Cargo.toml
└── src
    ├── main.rs
    └── bin
        └── my_other_bin.rs

To tell cargo to only compile or run this binary, we just pass cargo the --bin my_other_bin flag, where my_other_bin is the name of the binary we want to work with.

In addition to extra binaries, cargo supports more features such as benchmarks, tests, and examples.

In the next chapter, we will look more closely at tests.

Testing

As we know testing is integral to any piece of software! Rust has first-class support for unit and integration testing (see this chapter in TRPL).

From the testing chapters linked above, we see how to write unit tests and integration tests. Organizationally, we can place unit tests in the modules they test and integration tests in their own tests/ directory:

foo
├── Cargo.toml
├── src
│   └── main.rs
│   └── lib.rs
└── tests
    ├── my_test.rs
    └── my_other_test.rs

Each file in tests is a separate integration test, i.e. a test that is meant to test your library as if it were being called from a dependent crate.

The Testing chapter elaborates on the three different testing styles: Unit, Doc, and Integration.

cargo naturally provides an easy way to run all of your tests!

$ cargo test

You should see output like this:

$ cargo test
   Compiling blah v0.1.0 (file:///nobackup/blah)
    Finished dev [unoptimized + debuginfo] target(s) in 0.89 secs
     Running target/debug/deps/blah-d3b32b97275ec472

running 3 tests
test test_bar ... ok
test test_baz ... ok
test test_foo_bar ... ok
test test_foo ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

You can also run tests whose name matches a pattern:

$ cargo test test_foo
$ cargo test test_foo
   Compiling blah v0.1.0 (file:///nobackup/blah)
    Finished dev [unoptimized + debuginfo] target(s) in 0.35 secs
     Running target/debug/deps/blah-d3b32b97275ec472

running 2 tests
test test_foo ... ok
test test_foo_bar ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out

One word of caution: Cargo may run multiple tests concurrently, so make sure that they don't race with each other.

One example of this concurrency causing issues is if two tests output to a file, such as below:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    // Import the necessary modules
    use std::fs::OpenOptions;
    use std::io::Write;

    // This test writes to a file
    #[test]
    fn test_file() {
        // Opens the file ferris.txt or creates one if it doesn't exist.
        let mut file = OpenOptions::new()
            .append(true)
            .create(true)
            .open("ferris.txt")
            .expect("Failed to open ferris.txt");

        // Print "Ferris" 5 times.
        for _ in 0..5 {
            file.write_all("Ferris\n".as_bytes())
                .expect("Could not write to ferris.txt");
        }
    }

    // This test tries to write to the same file
    #[test]
    fn test_file_also() {
        // Opens the file ferris.txt or creates one if it doesn't exist.
        let mut file = OpenOptions::new()
            .append(true)
            .create(true)
            .open("ferris.txt")
            .expect("Failed to open ferris.txt");

        // Print "Corro" 5 times.
        for _ in 0..5 {
            file.write_all("Corro\n".as_bytes())
                .expect("Could not write to ferris.txt");
        }
    }
}
}

Although the intent is to get the following:

$ cat ferris.txt
Ferris
Ferris
Ferris
Ferris
Ferris
Corro
Corro
Corro
Corro
Corro

What actually gets put into ferris.txt is this:

$ cargo test test_foo
Corro
Ferris
Corro
Ferris
Corro
Ferris
Corro
Ferris
Corro
Ferris

Build Scripts

Sometimes a normal build from cargo is not enough. Perhaps your crate needs some pre-requisites before cargo will successfully compile, things like code generation, or some native code that needs to be compiled. To solve this problem we have build scripts that Cargo can run.

To add a build script to your package it can either be specified in the Cargo.toml as follows:

[package]
...
build = "build.rs"

Otherwise Cargo will look for a build.rs file in the project directory by default.

How to use a build script

The build script is simply another Rust file that will be compiled and invoked prior to compiling anything else in the package. Hence it can be used to fulfill pre-requisites of your crate.

Cargo provides the script with inputs via environment variables specified here that can be used.

The script provides output via stdout. All lines printed are written to target/debug/build/<pkg>/output. Further, lines prefixed with cargo: will be interpreted by Cargo directly and hence can be used to define parameters for the package's compilation.

For further specification and examples have a read of the Cargo specification.

Attributes

An attribute is metadata applied to some module, crate or item. This metadata can be used to/for:

When attributes apply to a whole crate, their syntax is #![crate_attribute], and when they apply to a module or item, the syntax is #[item_attribute] (notice the missing bang !).

Attributes can take arguments with different syntaxes:

  • #[attribute = "value"]
  • #[attribute(key = "value")]
  • #[attribute(value)]

Attributes can have multiple values and can be separated over multiple lines, too:

#[attribute(value, value2)]


#[attribute(value, value2, value3,
            value4, value5)]

dead_code

The compiler provides a dead_code lint that will warn about unused functions. An attribute can be used to disable the lint.

fn used_function() {}

// `#[allow(dead_code)]` is an attribute that disables the `dead_code` lint
#[allow(dead_code)]
fn unused_function() {}

fn noisy_unused_function() {}
// FIXME ^ Add an attribute to suppress the warning

fn main() {
    used_function();
}

Note that in real programs, you should eliminate dead code. In these examples we'll allow dead code in some places because of the interactive nature of the examples.

Crates

The crate_type attribute can be used to tell the compiler whether a crate is a binary or a library (and even which type of library), and the crate_name attribute can be used to set the name of the crate.

However, it is important to note that both the crate_type and crate_name attributes have no effect whatsoever when using Cargo, the Rust package manager. Since Cargo is used for the majority of Rust projects, this means real-world uses of crate_type and crate_name are relatively limited.

// This crate is a library
#![crate_type = "lib"]
// The library is named "rary"
#![crate_name = "rary"]

pub fn public_function() {
    println!("called rary's `public_function()`");
}

fn private_function() {
    println!("called rary's `private_function()`");
}

pub fn indirect_access() {
    print!("called rary's `indirect_access()`, that\n> ");

    private_function();
}

When the crate_type attribute is used, we no longer need to pass the --crate-type flag to rustc.

$ rustc lib.rs
$ ls lib*
library.rlib

cfg

Configuration conditional checks are possible through two different operators:

  • the cfg attribute: #[cfg(...)] in attribute position
  • the cfg! macro: cfg!(...) in boolean expressions

While the former enables conditional compilation, the latter conditionally evaluates to true or false literals allowing for checks at run-time. Both utilize identical argument syntax.

cfg!, unlike #[cfg], does not remove any code and only evaluates to true or false. For example, all blocks in an if/else expression need to be valid when cfg! is used for the condition, regardless of what cfg! is evaluating.

// This function only gets compiled if the target OS is linux
#[cfg(target_os = "linux")]
fn are_you_on_linux() {
    println!("You are running linux!");
}

// And this function only gets compiled if the target OS is *not* linux
#[cfg(not(target_os = "linux"))]
fn are_you_on_linux() {
    println!("You are *not* running linux!");
}

fn main() {
    are_you_on_linux();

    println!("Are you sure?");
    if cfg!(target_os = "linux") {
        println!("Yes. It's definitely linux!");
    } else {
        println!("Yes. It's definitely *not* linux!");
    }
}

See also:

the reference, cfg!, and macros.

Custom

Some conditionals like target_os are implicitly provided by rustc, but custom conditionals must be passed to rustc using the --cfg flag.

#[cfg(some_condition)]
fn conditional_function() {
    println!("condition met!");
}

fn main() {
    conditional_function();
}

Try to run this to see what happens without the custom cfg flag.

With the custom cfg flag:

$ rustc --cfg some_condition custom.rs && ./custom
condition met!

Generics

Generics is the topic of generalizing types and functionalities to broader cases. This is extremely useful for reducing code duplication in many ways, but can call for rather involved syntax. Namely, being generic requires taking great care to specify over which types a generic type is actually considered valid. The simplest and most common use of generics is for type parameters.

A type parameter is specified as generic by the use of angle brackets and upper camel case: <Aaa, Bbb, ...>. "Generic type parameters" are typically represented as <T>. In Rust, "generic" also describes anything that accepts one or more generic type parameters <T>. Any type specified as a generic type parameter is generic, and everything else is concrete (non-generic).

For example, defining a generic function named foo that takes an argument T of any type:

fn foo<T>(arg: T) { ... }

Because T has been specified as a generic type parameter using <T>, it is considered generic when used here as (arg: T). This is the case even if T has previously been defined as a struct.

This example shows some of the syntax in action:

// A concrete type `A`.
struct A;

// In defining the type `Single`, the first use of `A` is not preceded by `<A>`.
// Therefore, `Single` is a concrete type, and `A` is defined as above.
struct Single(A);
//            ^ Here is `Single`s first use of the type `A`.

// Here, `<T>` precedes the first use of `T`, so `SingleGen` is a generic type.
// Because the type parameter `T` is generic, it could be anything, including
// the concrete type `A` defined at the top.
struct SingleGen<T>(T);

fn main() {
    // `Single` is concrete and explicitly takes `A`.
    let _s = Single(A);
    
    // Create a variable `_char` of type `SingleGen<char>`
    // and give it the value `SingleGen('a')`.
    // Here, `SingleGen` has a type parameter explicitly specified.
    let _char: SingleGen<char> = SingleGen('a');

    // `SingleGen` can also have a type parameter implicitly specified:
    let _t    = SingleGen(A); // Uses `A` defined at the top.
    let _i32  = SingleGen(6); // Uses `i32`.
    let _char = SingleGen('a'); // Uses `char`.
}

See also:

structs

Functions

The same set of rules can be applied to functions: a type T becomes generic when preceded by <T>.

Using generic functions sometimes requires explicitly specifying type parameters. This may be the case if the function is called where the return type is generic, or if the compiler doesn't have enough information to infer the necessary type parameters.

A function call with explicitly specified type parameters looks like: fun::<A, B, ...>().

struct A;          // Concrete type `A`.
struct S(A);       // Concrete type `S`.
struct SGen<T>(T); // Generic type `SGen`.

// The following functions all take ownership of the variable passed into
// them and immediately go out of scope, freeing the variable.

// Define a function `reg_fn` that takes an argument `_s` of type `S`.
// This has no `<T>` so this is not a generic function.
fn reg_fn(_s: S) {}

// Define a function `gen_spec_t` that takes an argument `_s` of type `SGen<T>`.
// It has been explicitly given the type parameter `A`, but because `A` has not 
// been specified as a generic type parameter for `gen_spec_t`, it is not generic.
fn gen_spec_t(_s: SGen<A>) {}

// Define a function `gen_spec_i32` that takes an argument `_s` of type `SGen<i32>`.
// It has been explicitly given the type parameter `i32`, which is a specific type.
// Because `i32` is not a generic type, this function is also not generic.
fn gen_spec_i32(_s: SGen<i32>) {}

// Define a function `generic` that takes an argument `_s` of type `SGen<T>`.
// Because `SGen<T>` is preceded by `<T>`, this function is generic over `T`.
fn generic<T>(_s: SGen<T>) {}

fn main() {
    // Using the non-generic functions
    reg_fn(S(A));          // Concrete type.
    gen_spec_t(SGen(A));   // Implicitly specified type parameter `A`.
    gen_spec_i32(SGen(6)); // Implicitly specified type parameter `i32`.

    // Explicitly specified type parameter `char` to `generic()`.
    generic::<char>(SGen('a'));

    // Implicitly specified type parameter `char` to `generic()`.
    generic(SGen('c'));
}

See also:

functions and structs

Implementation

Similar to functions, implementations require care to remain generic.

#![allow(unused)]
fn main() {
struct S; // Concrete type `S`
struct GenericVal<T>(T); // Generic type `GenericVal`

// impl of GenericVal where we explicitly specify type parameters:
impl GenericVal<f32> {} // Specify `f32`
impl GenericVal<S> {} // Specify `S` as defined above

// `<T>` Must precede the type to remain generic
impl<T> GenericVal<T> {}
}
struct Val {
    val: f64,
}

struct GenVal<T> {
    gen_val: T,
}

// impl of Val
impl Val {
    fn value(&self) -> &f64 {
        &self.val
    }
}

// impl of GenVal for a generic type `T`
impl<T> GenVal<T> {
    fn value(&self) -> &T {
        &self.gen_val
    }
}

fn main() {
    let x = Val { val: 3.0 };
    let y = GenVal { gen_val: 3i32 };

    println!("{}, {}", x.value(), y.value());
}

See also:

functions returning references, impl, and struct

Traits

Of course traits can also be generic. Here we define one which reimplements the Drop trait as a generic method to drop itself and an input.

// Non-copyable types.
struct Empty;
struct Null;

// A trait generic over `T`.
trait DoubleDrop<T> {
    // Define a method on the caller type which takes an
    // additional single parameter `T` and does nothing with it.
    fn double_drop(self, _: T);
}

// Implement `DoubleDrop<T>` for any generic parameter `T` and
// caller `U`.
impl<T, U> DoubleDrop<T> for U {
    // This method takes ownership of both passed arguments,
    // deallocating both.
    fn double_drop(self, _: T) {}
}

fn main() {
    let empty = Empty;
    let null  = Null;

    // Deallocate `empty` and `null`.
    empty.double_drop(null);

    //empty;
    //null;
    // ^ TODO: Try uncommenting these lines.
}

See also:

Drop, struct, and trait

Bounds

When working with generics, the type parameters often must use traits as bounds to stipulate what functionality a type implements. For example, the following example uses the trait Display to print and so it requires T to be bound by Display; that is, T must implement Display.

// Define a function `printer` that takes a generic type `T` which
// must implement trait `Display`.
fn printer<T: Display>(t: T) {
    println!("{}", t);
}

Bounding restricts the generic to types that conform to the bounds. That is:

struct S<T: Display>(T);

// Error! `Vec<T>` does not implement `Display`. This
// specialization will fail.
let s = S(vec![1]);

Another effect of bounding is that generic instances are allowed to access the methods of traits specified in the bounds. For example:

// A trait which implements the print marker: `{:?}`.
use std::fmt::Debug;

trait HasArea {
    fn area(&self) -> f64;
}

impl HasArea for Rectangle {
    fn area(&self) -> f64 { self.length * self.height }
}

#[derive(Debug)]
struct Rectangle { length: f64, height: f64 }
#[allow(dead_code)]
struct Triangle  { length: f64, height: f64 }

// The generic `T` must implement `Debug`. Regardless
// of the type, this will work properly.
fn print_debug<T: Debug>(t: &T) {
    println!("{:?}", t);
}

// `T` must implement `HasArea`. Any type which meets
// the bound can access `HasArea`'s function `area`.
fn area<T: HasArea>(t: &T) -> f64 { t.area() }

fn main() {
    let rectangle = Rectangle { length: 3.0, height: 4.0 };
    let _triangle = Triangle  { length: 3.0, height: 4.0 };

    print_debug(&rectangle);
    println!("Area: {}", area(&rectangle));

    //print_debug(&_triangle);
    //println!("Area: {}", area(&_triangle));
    // ^ TODO: Try uncommenting these.
    // | Error: Does not implement either `Debug` or `HasArea`. 
}

As an additional note, where clauses can also be used to apply bounds in some cases to be more expressive.

See also:

std::fmt, structs, and traits

Testcase: empty bounds

A consequence of how bounds work is that even if a trait doesn't include any functionality, you can still use it as a bound. Eq and Copy are examples of such traits from the std library.

struct Cardinal;
struct BlueJay;
struct Turkey;

trait Red {}
trait Blue {}

impl Red for Cardinal {}
impl Blue for BlueJay {}

// These functions are only valid for types which implement these
// traits. The fact that the traits are empty is irrelevant.
fn red<T: Red>(_: &T)   -> &'static str { "red" }
fn blue<T: Blue>(_: &T) -> &'static str { "blue" }

fn main() {
    let cardinal = Cardinal;
    let blue_jay = BlueJay;
    let _turkey   = Turkey;

    // `red()` won't work on a blue jay nor vice versa
    // because of the bounds.
    println!("A cardinal is {}", red(&cardinal));
    println!("A blue jay is {}", blue(&blue_jay));
    //println!("A turkey is {}", red(&_turkey));
    // ^ TODO: Try uncommenting this line.
}

See also:

std::cmp::Eq, std::marker::Copy, and traits

Multiple bounds

Multiple bounds for a single type can be applied with a +. Like normal, different types are separated with ,.

use std::fmt::{Debug, Display};

fn compare_prints<T: Debug + Display>(t: &T) {
    println!("Debug: `{:?}`", t);
    println!("Display: `{}`", t);
}

fn compare_types<T: Debug, U: Debug>(t: &T, u: &U) {
    println!("t: `{:?}`", t);
    println!("u: `{:?}`", u);
}

fn main() {
    let string = "words";
    let array = [1, 2, 3];
    let vec = vec![1, 2, 3];

    compare_prints(&string);
    //compare_prints(&array);
    // TODO ^ Try uncommenting this.

    compare_types(&array, &vec);
}

See also:

std::fmt and traits

Where clauses

A bound can also be expressed using a where clause immediately before the opening {, rather than at the type's first mention. Additionally, where clauses can apply bounds to arbitrary types, rather than just to type parameters.

Some cases that a where clause is useful:

  • When specifying generic types and bounds separately is clearer:
impl <A: TraitB + TraitC, D: TraitE + TraitF> MyTrait<A, D> for YourType {}

// Expressing bounds with a `where` clause
impl <A, D> MyTrait<A, D> for YourType where
    A: TraitB + TraitC,
    D: TraitE + TraitF {}
  • When using a where clause is more expressive than using normal syntax. The impl in this example cannot be directly expressed without a where clause:
use std::fmt::Debug;

trait PrintInOption {
    fn print_in_option(self);
}

// Because we would otherwise have to express this as `T: Debug` or 
// use another method of indirect approach, this requires a `where` clause:
impl<T> PrintInOption for T where
    Option<T>: Debug {
    // We want `Option<T>: Debug` as our bound because that is what's
    // being printed. Doing otherwise would be using the wrong bound.
    fn print_in_option(self) {
        println!("{:?}", Some(self));
    }
}

fn main() {
    let vec = vec![1, 2, 3];

    vec.print_in_option();
}

See also:

RFC, struct, and trait

New Type Idiom

The newtype idiom gives compile time guarantees that the right type of value is supplied to a program.

For example, an age verification function that checks age in years, must be given a value of type Years.

struct Years(i64);

struct Days(i64);

impl Years {
    pub fn to_days(&self) -> Days {
        Days(self.0 * 365)
    }
}


impl Days {
    /// truncates partial years
    pub fn to_years(&self) -> Years {
        Years(self.0 / 365)
    }
}

fn old_enough(age: &Years) -> bool {
    age.0 >= 18
}

fn main() {
    let age = Years(5);
    let age_days = age.to_days();
    println!("Old enough {}", old_enough(&age));
    println!("Old enough {}", old_enough(&age_days.to_years()));
    // println!("Old enough {}", old_enough(&age_days));
}

Uncomment the last print statement to observe that the type supplied must be Years.

To obtain the newtype's value as the base type, you may use the tuple or destructuring syntax like so:

struct Years(i64);

fn main() {
    let years = Years(42);
    let years_as_primitive_1: i64 = years.0; // Tuple
    let Years(years_as_primitive_2) = years; // Destructuring
}

See also:

structs

Associated items

"Associated Items" refers to a set of rules pertaining to items of various types. It is an extension to trait generics, and allows traits to internally define new items.

One such item is called an associated type, providing simpler usage patterns when the trait is generic over its container type.

See also:

RFC

The Problem

A trait that is generic over its container type has type specification requirements - users of the trait must specify all of its generic types.

In the example below, the Contains trait allows the use of the generic types A and B. The trait is then implemented for the Container type, specifying i32 for A and B so that it can be used with fn difference().

Because Contains is generic, we are forced to explicitly state all of the generic types for fn difference(). In practice, we want a way to express that A and B are determined by the input C. As you will see in the next section, associated types provide exactly that capability.

struct Container(i32, i32);

// A trait which checks if 2 items are stored inside of container.
// Also retrieves first or last value.
trait Contains<A, B> {
    fn contains(&self, _: &A, _: &B) -> bool; // Explicitly requires `A` and `B`.
    fn first(&self) -> i32; // Doesn't explicitly require `A` or `B`.
    fn last(&self) -> i32;  // Doesn't explicitly require `A` or `B`.
}

impl Contains<i32, i32> for Container {
    // True if the numbers stored are equal.
    fn contains(&self, number_1: &i32, number_2: &i32) -> bool {
        (&self.0 == number_1) && (&self.1 == number_2)
    }

    // Grab the first number.
    fn first(&self) -> i32 { self.0 }

    // Grab the last number.
    fn last(&self) -> i32 { self.1 }
}

// `C` contains `A` and `B`. In light of that, having to express `A` and
// `B` again is a nuisance.
fn difference<A, B, C>(container: &C) -> i32 where
    C: Contains<A, B> {
    container.last() - container.first()
}

fn main() {
    let number_1 = 3;
    let number_2 = 10;

    let container = Container(number_1, number_2);

    println!("Does container contain {} and {}: {}",
        &number_1, &number_2,
        container.contains(&number_1, &number_2));
    println!("First number: {}", container.first());
    println!("Last number: {}", container.last());

    println!("The difference is: {}", difference(&container));
}

See also:

structs, and traits

Associated types

The use of "Associated types" improves the overall readability of code by moving inner types locally into a trait as output types. Syntax for the trait definition is as follows:

#![allow(unused)]
fn main() {
// `A` and `B` are defined in the trait via the `type` keyword.
// (Note: `type` in this context is different from `type` when used for
// aliases).
trait Contains {
    type A;
    type B;

    // Updated syntax to refer to these new types generically.
    fn contains(&self, _: &Self::A, _: &Self::B) -> bool;
}
}

Note that functions that use the trait Contains are no longer required to express A or B at all:

// Without using associated types
fn difference<A, B, C>(container: &C) -> i32 where
    C: Contains<A, B> { ... }

// Using associated types
fn difference<C: Contains>(container: &C) -> i32 { ... }

Let's rewrite the example from the previous section using associated types:

struct Container(i32, i32);

// A trait which checks if 2 items are stored inside of container.
// Also retrieves first or last value.
trait Contains {
    // Define generic types here which methods will be able to utilize.
    type A;
    type B;

    fn contains(&self, _: &Self::A, _: &Self::B) -> bool;
    fn first(&self) -> i32;
    fn last(&self) -> i32;
}

impl Contains for Container {
    // Specify what types `A` and `B` are. If the `input` type
    // is `Container(i32, i32)`, the `output` types are determined
    // as `i32` and `i32`.
    type A = i32;
    type B = i32;

    // `&Self::A` and `&Self::B` are also valid here.
    fn contains(&self, number_1: &i32, number_2: &i32) -> bool {
        (&self.0 == number_1) && (&self.1 == number_2)
    }
    // Grab the first number.
    fn first(&self) -> i32 { self.0 }

    // Grab the last number.
    fn last(&self) -> i32 { self.1 }
}

fn difference<C: Contains>(container: &C) -> i32 {
    container.last() - container.first()
}

fn main() {
    let number_1 = 3;
    let number_2 = 10;

    let container = Container(number_1, number_2);

    println!("Does container contain {} and {}: {}",
        &number_1, &number_2,
        container.contains(&number_1, &number_2));
    println!("First number: {}", container.first());
    println!("Last number: {}", container.last());
    
    println!("The difference is: {}", difference(&container));
}

Phantom type parameters

A phantom type parameter is one that doesn't show up at runtime, but is checked statically (and only) at compile time.

Data types can use extra generic type parameters to act as markers or to perform type checking at compile time. These extra parameters hold no storage values, and have no runtime behavior.

In the following example, we combine std::marker::PhantomData with the phantom type parameter concept to create tuples containing different data types.

use std::marker::PhantomData;

// A phantom tuple struct which is generic over `A` with hidden parameter `B`.
#[derive(PartialEq)] // Allow equality test for this type.
struct PhantomTuple<A, B>(A, PhantomData<B>);

// A phantom type struct which is generic over `A` with hidden parameter `B`.
#[derive(PartialEq)] // Allow equality test for this type.
struct PhantomStruct<A, B> { first: A, phantom: PhantomData<B> }

// Note: Storage is allocated for generic type `A`, but not for `B`.
//       Therefore, `B` cannot be used in computations.

fn main() {
    // Here, `f32` and `f64` are the hidden parameters.
    // PhantomTuple type specified as `<char, f32>`.
    let _tuple1: PhantomTuple<char, f32> = PhantomTuple('Q', PhantomData);
    // PhantomTuple type specified as `<char, f64>`.
    let _tuple2: PhantomTuple<char, f64> = PhantomTuple('Q', PhantomData);

    // Type specified as `<char, f32>`.
    let _struct1: PhantomStruct<char, f32> = PhantomStruct {
        first: 'Q',
        phantom: PhantomData,
    };
    // Type specified as `<char, f64>`.
    let _struct2: PhantomStruct<char, f64> = PhantomStruct {
        first: 'Q',
        phantom: PhantomData,
    };

    // Compile-time Error! Type mismatch so these cannot be compared:
    // println!("_tuple1 == _tuple2 yields: {}",
    //           _tuple1 == _tuple2);

    // Compile-time Error! Type mismatch so these cannot be compared:
    // println!("_struct1 == _struct2 yields: {}",
    //           _struct1 == _struct2);
}

See also:

Derive, struct, and TupleStructs

Testcase: unit clarification

A useful method of unit conversions can be examined by implementing Add with a phantom type parameter. The Add trait is examined below:

// This construction would impose: `Self + RHS = Output`
// where RHS defaults to Self if not specified in the implementation.
pub trait Add<RHS = Self> {
    type Output;

    fn add(self, rhs: RHS) -> Self::Output;
}

// `Output` must be `T<U>` so that `T<U> + T<U> = T<U>`.
impl<U> Add for T<U> {
    type Output = T<U>;
    ...
}

The whole implementation:

use std::ops::Add;
use std::marker::PhantomData;

/// Create void enumerations to define unit types.
#[derive(Debug, Clone, Copy)]
enum Inch {}
#[derive(Debug, Clone, Copy)]
enum Mm {}

/// `Length` is a type with phantom type parameter `Unit`,
/// and is not generic over the length type (that is `f64`).
///
/// `f64` already implements the `Clone` and `Copy` traits.
#[derive(Debug, Clone, Copy)]
struct Length<Unit>(f64, PhantomData<Unit>);

/// The `Add` trait defines the behavior of the `+` operator.
impl<Unit> Add for Length<Unit> {
    type Output = Length<Unit>;

    // add() returns a new `Length` struct containing the sum.
    fn add(self, rhs: Length<Unit>) -> Length<Unit> {
        // `+` calls the `Add` implementation for `f64`.
        Length(self.0 + rhs.0, PhantomData)
    }
}

fn main() {
    // Specifies `one_foot` to have phantom type parameter `Inch`.
    let one_foot:  Length<Inch> = Length(12.0, PhantomData);
    // `one_meter` has phantom type parameter `Mm`.
    let one_meter: Length<Mm>   = Length(1000.0, PhantomData);

    // `+` calls the `add()` method we implemented for `Length<Unit>`.
    //
    // Since `Length` implements `Copy`, `add()` does not consume
    // `one_foot` and `one_meter` but copies them into `self` and `rhs`.
    let two_feet = one_foot + one_foot;
    let two_meters = one_meter + one_meter;

    // Addition works.
    println!("one foot + one_foot = {:?} in", two_feet.0);
    println!("one meter + one_meter = {:?} mm", two_meters.0);

    // Nonsensical operations fail as they should:
    // Compile-time Error: type mismatch.
    //let one_feter = one_foot + one_meter;
}

See also:

Borrowing (&), Bounds (X: Y), enum, impl & self, Overloading, ref, Traits (X for Y), and TupleStructs.

Scoping rules

Scopes play an important part in ownership, borrowing, and lifetimes. That is, they indicate to the compiler when borrows are valid, when resources can be freed, and when variables are created or destroyed.

RAII

Variables in Rust do more than just hold data in the stack: they also own resources, e.g. Box<T> owns memory in the heap. Rust enforces RAII (Resource Acquisition Is Initialization), so whenever an object goes out of scope, its destructor is called and its owned resources are freed.

This behavior shields against resource leak bugs, so you'll never have to manually free memory or worry about memory leaks again! Here's a quick showcase:

// raii.rs
fn create_box() {
    // Allocate an integer on the heap
    let _box1 = Box::new(3i32);

    // `_box1` is destroyed here, and memory gets freed
}

fn main() {
    // Allocate an integer on the heap
    let _box2 = Box::new(5i32);

    // A nested scope:
    {
        // Allocate an integer on the heap
        let _box3 = Box::new(4i32);

        // `_box3` is destroyed here, and memory gets freed
    }

    // Creating lots of boxes just for fun
    // There's no need to manually free memory!
    for _ in 0u32..1_000 {
        create_box();
    }

    // `_box2` is destroyed here, and memory gets freed
}

Of course, we can double check for memory errors using valgrind:

$ rustc raii.rs && valgrind ./raii
==26873== Memcheck, a memory error detector
==26873== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==26873== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==26873== Command: ./raii
==26873==
==26873==
==26873== HEAP SUMMARY:
==26873==     in use at exit: 0 bytes in 0 blocks
==26873==   total heap usage: 1,013 allocs, 1,013 frees, 8,696 bytes allocated
==26873==
==26873== All heap blocks were freed -- no leaks are possible
==26873==
==26873== For counts of detected and suppressed errors, rerun with: -v
==26873== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

No leaks here!

Destructor

The notion of a destructor in Rust is provided through the Drop trait. The destructor is called when the resource goes out of scope. This trait is not required to be implemented for every type, only implement it for your type if you require its own destructor logic.

Run the below example to see how the Drop trait works. When the variable in the main function goes out of scope the custom destructor will be invoked.

struct ToDrop;

impl Drop for ToDrop {
    fn drop(&mut self) {
        println!("ToDrop is being dropped");
    }
}

fn main() {
    let x = ToDrop;
    println!("Made a ToDrop!");
}

See also:

Box

Ownership and moves

Because variables are in charge of freeing their own resources, resources can only have one owner. This also prevents resources from being freed more than once. Note that not all variables own resources (e.g. references).

When doing assignments (let x = y) or passing function arguments by value (foo(x)), the ownership of the resources is transferred. In Rust-speak, this is known as a move.

After moving resources, the previous owner can no longer be used. This avoids creating dangling pointers.

// This function takes ownership of the heap allocated memory
fn destroy_box(c: Box<i32>) {
    println!("Destroying a box that contains {}", c);

    // `c` is destroyed and the memory freed
}

fn main() {
    // _Stack_ allocated integer
    let x = 5u32;

    // *Copy* `x` into `y` - no resources are moved
    let y = x;

    // Both values can be independently used
    println!("x is {}, and y is {}", x, y);

    // `a` is a pointer to a _heap_ allocated integer
    let a = Box::new(5i32);

    println!("a contains: {}", a);

    // *Move* `a` into `b`
    let b = a;
    // The pointer address of `a` is copied (not the data) into `b`.
    // Both are now pointers to the same heap allocated data, but
    // `b` now owns it.
    
    // Error! `a` can no longer access the data, because it no longer owns the
    // heap memory
    //println!("a contains: {}", a);
    // TODO ^ Try uncommenting this line

    // This function takes ownership of the heap allocated memory from `b`
    destroy_box(b);

    // Since the heap memory has been freed at this point, this action would
    // result in dereferencing freed memory, but it's forbidden by the compiler
    // Error! Same reason as the previous Error
    //println!("b contains: {}", b);
    // TODO ^ Try uncommenting this line
}

Mutability

Mutability of data can be changed when ownership is transferred.

fn main() {
    let immutable_box = Box::new(5u32);

    println!("immutable_box contains {}", immutable_box);

    // Mutability error
    //*immutable_box = 4;

    // *Move* the box, changing the ownership (and mutability)
    let mut mutable_box = immutable_box;

    println!("mutable_box contains {}", mutable_box);

    // Modify the contents of the box
    *mutable_box = 4;

    println!("mutable_box now contains {}", mutable_box);
}

Partial moves

Within the destructuring of a single variable, both by-move and by-reference pattern bindings can be used at the same time. Doing this will result in a partial move of the variable, which means that parts of the variable will be moved while other parts stay. In such a case, the parent variable cannot be used afterwards as a whole, however the parts that are only referenced (and not moved) can still be used.

fn main() {
    #[derive(Debug)]
    struct Person {
        name: String,
        age: Box<u8>,
    }

    let person = Person {
        name: String::from("Alice"),
        age: Box::new(20),
    };

    // `name` is moved out of person, but `age` is referenced
    let Person { name, ref age } = person;

    println!("The person's age is {}", age);

    println!("The person's name is {}", name);

    // Error! borrow of partially moved value: `person` partial move occurs
    //println!("The person struct is {:?}", person);

    // `person` cannot be used but `person.age` can be used as it is not moved
    println!("The person's age from person struct is {}", person.age);
}

(In this example, we store the age variable on the heap to illustrate the partial move: deleting ref in the above code would give an error as the ownership of person.age would be moved to the variable age. If Person.age were stored on the stack, ref would not be required as the definition of age would copy the data from person.age without moving it.)

See also:

destructuring

Borrowing

Most of the time, we'd like to access data without taking ownership over it. To accomplish this, Rust uses a borrowing mechanism. Instead of passing objects by value (T), objects can be passed by reference (&T).

The compiler statically guarantees (via its borrow checker) that references always point to valid objects. That is, while references to an object exist, the object cannot be destroyed.

// This function takes ownership of a box and destroys it
fn eat_box_i32(boxed_i32: Box<i32>) {
    println!("Destroying box that contains {}", boxed_i32);
}

// This function borrows an i32
fn borrow_i32(borrowed_i32: &i32) {
    println!("This int is: {}", borrowed_i32);
}

fn main() {
    // Create a boxed i32, and a stacked i32
    let boxed_i32 = Box::new(5_i32);
    let stacked_i32 = 6_i32;

    // Borrow the contents of the box. Ownership is not taken,
    // so the contents can be borrowed again.
    borrow_i32(&boxed_i32);
    borrow_i32(&stacked_i32);

    {
        // Take a reference to the data contained inside the box
        let _ref_to_i32: &i32 = &boxed_i32;

        // Error!
        // Can't destroy `boxed_i32` while the inner value is borrowed later in scope.
        eat_box_i32(boxed_i32);
        // FIXME ^ Comment out this line

        // Attempt to borrow `_ref_to_i32` after inner value is destroyed
        borrow_i32(_ref_to_i32);
        // `_ref_to_i32` goes out of scope and is no longer borrowed.
    }

    // `boxed_i32` can now give up ownership to `eat_box` and be destroyed
    eat_box_i32(boxed_i32);
}

Mutability

Mutable data can be mutably borrowed using &mut T. This is called a mutable reference and gives read/write access to the borrower. In contrast, &T borrows the data via an immutable reference, and the borrower can read the data but not modify it:

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Book {
    // `&'static str` is a reference to a string allocated in read only memory
    author: &'static str,
    title: &'static str,
    year: u32,
}

// This function takes a reference to a book
fn borrow_book(book: &Book) {
    println!("I immutably borrowed {} - {} edition", book.title, book.year);
}

// This function takes a reference to a mutable book and changes `year` to 2014
fn new_edition(book: &mut Book) {
    book.year = 2014;
    println!("I mutably borrowed {} - {} edition", book.title, book.year);
}

fn main() {
    // Create an immutable Book named `immutabook`
    let immutabook = Book {
        // string literals have type `&'static str`
        author: "Douglas Hofstadter",
        title: "Gödel, Escher, Bach",
        year: 1979,
    };

    // Create a mutable copy of `immutabook` and call it `mutabook`
    let mut mutabook = immutabook;
    
    // Immutably borrow an immutable object
    borrow_book(&immutabook);

    // Immutably borrow a mutable object
    borrow_book(&mutabook);
    
    // Borrow a mutable object as mutable
    new_edition(&mut mutabook);
    
    // Error! Cannot borrow an immutable object as mutable
    new_edition(&mut immutabook);
    // FIXME ^ Comment out this line
}

See also:

static

Aliasing

Data can be immutably borrowed any number of times, but while immutably borrowed, the original data can't be mutably borrowed. On the other hand, only one mutable borrow is allowed at a time. The original data can be borrowed again only after the mutable reference has been used for the last time.

struct Point { x: i32, y: i32, z: i32 }

fn main() {
    let mut point = Point { x: 0, y: 0, z: 0 };

    let borrowed_point = &point;
    let another_borrow = &point;

    // Data can be accessed via the references and the original owner
    println!("Point has coordinates: ({}, {}, {})",
                borrowed_point.x, another_borrow.y, point.z);

    // Error! Can't borrow `point` as mutable because it's currently
    // borrowed as immutable.
    // let mutable_borrow = &mut point;
    // TODO ^ Try uncommenting this line

    // The borrowed values are used again here
    println!("Point has coordinates: ({}, {}, {})",
                borrowed_point.x, another_borrow.y, point.z);

    // The immutable references are no longer used for the rest of the code so
    // it is possible to reborrow with a mutable reference.
    let mutable_borrow = &mut point;

    // Change data via mutable reference
    mutable_borrow.x = 5;
    mutable_borrow.y = 2;
    mutable_borrow.z = 1;

    // Error! Can't borrow `point` as immutable because it's currently
    // borrowed as mutable.
    // let y = &point.y;
    // TODO ^ Try uncommenting this line

    // Error! Can't print because `println!` takes an immutable reference.
    // println!("Point Z coordinate is {}", point.z);
    // TODO ^ Try uncommenting this line

    // Ok! Mutable references can be passed as immutable to `println!`
    println!("Point has coordinates: ({}, {}, {})",
                mutable_borrow.x, mutable_borrow.y, mutable_borrow.z);

    // The mutable reference is no longer used for the rest of the code so it
    // is possible to reborrow
    let new_borrowed_point = &point;
    println!("Point now has coordinates: ({}, {}, {})",
             new_borrowed_point.x, new_borrowed_point.y, new_borrowed_point.z);
}

The ref pattern

When doing pattern matching or destructuring via the let binding, the ref keyword can be used to take references to the fields of a struct/tuple. The example below shows a few instances where this can be useful:

#[derive(Clone, Copy)]
struct Point { x: i32, y: i32 }

fn main() {
    let c = 'Q';

    // A `ref` borrow on the left side of an assignment is equivalent to
    // an `&` borrow on the right side.
    let ref ref_c1 = c;
    let ref_c2 = &c;

    println!("ref_c1 equals ref_c2: {}", *ref_c1 == *ref_c2);

    let point = Point { x: 0, y: 0 };

    // `ref` is also valid when destructuring a struct.
    let _copy_of_x = {
        // `ref_to_x` is a reference to the `x` field of `point`.
        let Point { x: ref ref_to_x, y: _ } = point;

        // Return a copy of the `x` field of `point`.
        *ref_to_x
    };

    // A mutable copy of `point`
    let mut mutable_point = point;

    {
        // `ref` can be paired with `mut` to take mutable references.
        let Point { x: _, y: ref mut mut_ref_to_y } = mutable_point;

        // Mutate the `y` field of `mutable_point` via a mutable reference.
        *mut_ref_to_y = 1;
    }

    println!("point is ({}, {})", point.x, point.y);
    println!("mutable_point is ({}, {})", mutable_point.x, mutable_point.y);

    // A mutable tuple that includes a pointer
    let mut mutable_tuple = (Box::new(5u32), 3u32);
    
    {
        // Destructure `mutable_tuple` to change the value of `last`.
        let (_, ref mut last) = mutable_tuple;
        *last = 2u32;
    }
    
    println!("tuple is {:?}", mutable_tuple);
}

Lifetimes

A lifetime is a construct of the compiler (or more specifically, its borrow checker) uses to ensure all borrows are valid. Specifically, a variable's lifetime begins when it is created and ends when it is destroyed. While lifetimes and scopes are often referred to together, they are not the same.

Take, for example, the case where we borrow a variable via &. The borrow has a lifetime that is determined by where it is declared. As a result, the borrow is valid as long as it ends before the lender is destroyed. However, the scope of the borrow is determined by where the reference is used.

In the following example and in the rest of this section, we will see how lifetimes relate to scopes, as well as how the two differ.

// Lifetimes are annotated below with lines denoting the creation
// and destruction of each variable.
// `i` has the longest lifetime because its scope entirely encloses 
// both `borrow1` and `borrow2`. The duration of `borrow1` compared 
// to `borrow2` is irrelevant since they are disjoint.
fn main() {
    let i = 3; // Lifetime for `i` starts. ────────────────┐
    //                                                     │
    { //                                                   │
        let borrow1 = &i; // `borrow1` lifetime starts. ──┐│
        //                                                ││
        println!("borrow1: {}", borrow1); //              ││
    } // `borrow1` ends. ─────────────────────────────────┘│
    //                                                     │
    //                                                     │
    { //                                                   │
        let borrow2 = &i; // `borrow2` lifetime starts. ──┐│
        //                                                ││
        println!("borrow2: {}", borrow2); //              ││
    } // `borrow2` ends. ─────────────────────────────────┘│
    //                                                     │
}   // Lifetime ends. ─────────────────────────────────────┘

Note that no names or types are assigned to label lifetimes. This restricts how lifetimes will be able to be used as we will see.

Explicit annotation

The borrow checker uses explicit lifetime annotations to determine how long references should be valid. In cases where lifetimes are not elided1, Rust requires explicit annotations to determine what the lifetime of a reference should be. The syntax for explicitly annotating a lifetime uses an apostrophe character as follows:

foo<'a>
// `foo` has a lifetime parameter `'a`

Similar to closures, using lifetimes requires generics. Additionally, this lifetime syntax indicates that the lifetime of foo may not exceed that of 'a. Explicit annotation of a type has the form &'a T where 'a has already been introduced.

In cases with multiple lifetimes, the syntax is similar:

foo<'a, 'b>
// `foo` has lifetime parameters `'a` and `'b`

In this case, the lifetime of foo cannot exceed that of either 'a or 'b.

See the following example for explicit lifetime annotation in use:

// `print_refs` takes two references to `i32` which have different
// lifetimes `'a` and `'b`. These two lifetimes must both be at
// least as long as the function `print_refs`.
fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) {
    println!("x is {} and y is {}", x, y);
}

// A function which takes no arguments, but has a lifetime parameter `'a`.
fn failed_borrow<'a>() {
    let _x = 12;

    // ERROR: `_x` does not live long enough
    let y: &'a i32 = &_x;
    // Attempting to use the lifetime `'a` as an explicit type annotation 
    // inside the function will fail because the lifetime of `&_x` is shorter
    // than that of `y`. A short lifetime cannot be coerced into a longer one.
}

fn main() {
    // Create variables to be borrowed below.
    let (four, nine) = (4, 9);
    
    // Borrows (`&`) of both variables are passed into the function.
    print_refs(&four, &nine);
    // Any input which is borrowed must outlive the borrower. 
    // In other words, the lifetime of `four` and `nine` must 
    // be longer than that of `print_refs`.
    
    failed_borrow();
    // `failed_borrow` contains no references to force `'a` to be 
    // longer than the lifetime of the function, but `'a` is longer.
    // Because the lifetime is never constrained, it defaults to `'static`.
}
1

elision implicitly annotates lifetimes and so is different.

See also:

generics and closures

Functions

Ignoring elision, function signatures with lifetimes have a few constraints:

  • any reference must have an annotated lifetime.
  • any reference being returned must have the same lifetime as an input or be static.

Additionally, note that returning references without input is banned if it would result in returning references to invalid data. The following example shows off some valid forms of functions with lifetimes:

// One input reference with lifetime `'a` which must live
// at least as long as the function.
fn print_one<'a>(x: &'a i32) {
    println!("`print_one`: x is {}", x);
}

// Mutable references are possible with lifetimes as well.
fn add_one<'a>(x: &'a mut i32) {
    *x += 1;
}

// Multiple elements with different lifetimes. In this case, it
// would be fine for both to have the same lifetime `'a`, but
// in more complex cases, different lifetimes may be required.
fn print_multi<'a, 'b>(x: &'a i32, y: &'b i32) {
    println!("`print_multi`: x is {}, y is {}", x, y);
}

// Returning references that have been passed in is acceptable.
// However, the correct lifetime must be returned.
fn pass_x<'a, 'b>(x: &'a i32, _: &'b i32) -> &'a i32 { x }

//fn invalid_output<'a>() -> &'a String { &String::from("foo") }
// The above is invalid: `'a` must live longer than the function.
// Here, `&String::from("foo")` would create a `String`, followed by a
// reference. Then the data is dropped upon exiting the scope, leaving
// a reference to invalid data to be returned.

fn main() {
    let x = 7;
    let y = 9;
    
    print_one(&x);
    print_multi(&x, &y);
    
    let z = pass_x(&x, &y);
    print_one(z);

    let mut t = 3;
    add_one(&mut t);
    print_one(&t);
}

See also:

functions

Methods

Methods are annotated similarly to functions:

struct Owner(i32);

impl Owner {
    // Annotate lifetimes as in a standalone function.
    fn add_one<'a>(&'a mut self) { self.0 += 1; }
    fn print<'a>(&'a self) {
        println!("`print`: {}", self.0);
    }
}

fn main() {
    let mut owner = Owner(18);

    owner.add_one();
    owner.print();
}

See also:

methods

Structs

Annotation of lifetimes in structures are also similar to functions:

// A type `Borrowed` which houses a reference to an
// `i32`. The reference to `i32` must outlive `Borrowed`.
#[derive(Debug)]
struct Borrowed<'a>(&'a i32);

// Similarly, both references here must outlive this structure.
#[derive(Debug)]
struct NamedBorrowed<'a> {
    x: &'a i32,
    y: &'a i32,
}

// An enum which is either an `i32` or a reference to one.
#[derive(Debug)]
enum Either<'a> {
    Num(i32),
    Ref(&'a i32),
}

fn main() {
    let x = 18;
    let y = 15;

    let single = Borrowed(&x);
    let double = NamedBorrowed { x: &x, y: &y };
    let reference = Either::Ref(&x);
    let number    = Either::Num(y);

    println!("x is borrowed in {:?}", single);
    println!("x and y are borrowed in {:?}", double);
    println!("x is borrowed in {:?}", reference);
    println!("y is *not* borrowed in {:?}", number);
}

See also:

structs

Traits

Annotation of lifetimes in trait methods basically are similar to functions. Note that impl may have annotation of lifetimes too.

// A struct with annotation of lifetimes.
#[derive(Debug)]
struct Borrowed<'a> {
    x: &'a i32,
}

// Annotate lifetimes to impl.
impl<'a> Default for Borrowed<'a> {
    fn default() -> Self {
        Self {
            x: &10,
        }
    }
}

fn main() {
    let b: Borrowed = Default::default();
    println!("b is {:?}", b);
}

See also:

traits

Bounds

Just like generic types can be bounded, lifetimes (themselves generic) use bounds as well. The : character has a slightly different meaning here, but + is the same. Note how the following read:

  1. T: 'a: All references in T must outlive lifetime 'a.
  2. T: Trait + 'a: Type T must implement trait Trait and all references in T must outlive 'a.

The example below shows the above syntax in action used after keyword where:

use std::fmt::Debug; // Trait to bound with.

#[derive(Debug)]
struct Ref<'a, T: 'a>(&'a T);
// `Ref` contains a reference to a generic type `T` that has
// an unknown lifetime `'a`. `T` is bounded such that any
// *references* in `T` must outlive `'a`. Additionally, the lifetime
// of `Ref` may not exceed `'a`.

// A generic function which prints using the `Debug` trait.
fn print<T>(t: T) where
    T: Debug {
    println!("`print`: t is {:?}", t);
}

// Here a reference to `T` is taken where `T` implements
// `Debug` and all *references* in `T` outlive `'a`. In
// addition, `'a` must outlive the function.
fn print_ref<'a, T>(t: &'a T) where
    T: Debug + 'a {
    println!("`print_ref`: t is {:?}", t);
}

fn main() {
    let x = 7;
    let ref_x = Ref(&x);

    print_ref(&ref_x);
    print(ref_x);
}

See also:

generics, bounds in generics, and multiple bounds in generics

Coercion

A longer lifetime can be coerced into a shorter one so that it works inside a scope it normally wouldn't work in. This comes in the form of inferred coercion by the Rust compiler, and also in the form of declaring a lifetime difference:

// Here, Rust infers a lifetime that is as short as possible.
// The two references are then coerced to that lifetime.
fn multiply<'a>(first: &'a i32, second: &'a i32) -> i32 {
    first * second
}

// `<'a: 'b, 'b>` reads as lifetime `'a` is at least as long as `'b`.
// Here, we take in an `&'a i32` and return a `&'b i32` as a result of coercion.
fn choose_first<'a: 'b, 'b>(first: &'a i32, _: &'b i32) -> &'b i32 {
    first
}

fn main() {
    let first = 2; // Longer lifetime
    
    {
        let second = 3; // Shorter lifetime
        
        println!("The product is {}", multiply(&first, &second));
        println!("{} is the first", choose_first(&first, &second));
    };
}

Static

Rust has a few reserved lifetime names. One of those is 'static. You might encounter it in two situations:

// A reference with 'static lifetime:
let s: &'static str = "hello world";

// 'static as part of a trait bound:
fn generic<T>(x: T) where T: 'static {}

Both are related but subtly different and this is a common source for confusion when learning Rust. Here are some examples for each situation:

Reference lifetime

As a reference lifetime 'static indicates that the data pointed to by the reference lives for the entire lifetime of the running program. It can still be coerced to a shorter lifetime.

There are two ways to make a variable with 'static lifetime, and both are stored in the read-only memory of the binary:

  • Make a constant with the static declaration.
  • Make a string literal which has type: &'static str.

See the following example for a display of each method:

// Make a constant with `'static` lifetime.
static NUM: i32 = 18;

// Returns a reference to `NUM` where its `'static`
// lifetime is coerced to that of the input argument.
fn coerce_static<'a>(_: &'a i32) -> &'a i32 {
    &NUM
}

fn main() {
    {
        // Make a `string` literal and print it:
        let static_string = "I'm in read-only memory";
        println!("static_string: {}", static_string);

        // When `static_string` goes out of scope, the reference
        // can no longer be used, but the data remains in the binary.
    }

    {
        // Make an integer to use for `coerce_static`:
        let lifetime_num = 9;

        // Coerce `NUM` to lifetime of `lifetime_num`:
        let coerced_static = coerce_static(&lifetime_num);

        println!("coerced_static: {}", coerced_static);
    }

    println!("NUM: {} stays accessible!", NUM);
}

Trait bound

As a trait bound, it means the type does not contain any non-static references. Eg. the receiver can hold on to the type for as long as they want and it will never become invalid until they drop it.

It's important to understand this means that any owned data always passes a 'static lifetime bound, but a reference to that owned data generally does not:

use std::fmt::Debug;

fn print_it( input: impl Debug + 'static ) {
    println!( "'static value passed in is: {:?}", input );
}

fn main() {
    // i is owned and contains no references, thus it's 'static:
    let i = 5;
    print_it(i);

    // oops, &i only has the lifetime defined by the scope of
    // main(), so it's not 'static:
    print_it(&i);
}

The compiler will tell you:

error[E0597]: `i` does not live long enough
  --> src/lib.rs:15:15
   |
15 |     print_it(&i);
   |     ---------^^--
   |     |         |
   |     |         borrowed value does not live long enough
   |     argument requires that `i` is borrowed for `'static`
16 | }
   | - `i` dropped here while still borrowed

See also:

'static constants

Elision

Some lifetime patterns are overwhelmingly common and so the borrow checker will allow you to omit them to save typing and to improve readability. This is known as elision. Elision exists in Rust solely because these patterns are common.

The following code shows a few examples of elision. For a more comprehensive description of elision, see lifetime elision in the book.

// `elided_input` and `annotated_input` essentially have identical signatures
// because the lifetime of `elided_input` is inferred by the compiler:
fn elided_input(x: &i32) {
    println!("`elided_input`: {}", x);
}

fn annotated_input<'a>(x: &'a i32) {
    println!("`annotated_input`: {}", x);
}

// Similarly, `elided_pass` and `annotated_pass` have identical signatures
// because the lifetime is added implicitly to `elided_pass`:
fn elided_pass(x: &i32) -> &i32 { x }

fn annotated_pass<'a>(x: &'a i32) -> &'a i32 { x }

fn main() {
    let x = 3;

    elided_input(&x);
    annotated_input(&x);

    println!("`elided_pass`: {}", elided_pass(&x));
    println!("`annotated_pass`: {}", annotated_pass(&x));
}

See also:

elision

Traits

A trait is a collection of methods defined for an unknown type: Self. They can access other methods declared in the same trait.

Traits can be implemented for any data type. In the example below, we define Animal, a group of methods. The Animal trait is then implemented for the Sheep data type, allowing the use of methods from Animal with a Sheep:

struct Sheep { naked: bool, name: &'static str }

trait Animal {
    // Associated function signature; `Self` refers to the implementor type.
    fn new(name: &'static str) -> Self;

    // Method signatures; these will return a string.
    fn name(&self) -> &'static str;
    fn noise(&self) -> &'static str;

    // Traits can provide default method definitions.
    fn talk(&self) {
        println!("{} says {}", self.name(), self.noise());
    }
}

impl Sheep {
    fn is_naked(&self) -> bool {
        self.naked
    }

    fn shear(&mut self) {
        if self.is_naked() {
            // Implementor methods can use the implementor's trait methods.
            println!("{} is already naked...", self.name());
        } else {
            println!("{} gets a haircut!", self.name);

            self.naked = true;
        }
    }
}

// Implement the `Animal` trait for `Sheep`.
impl Animal for Sheep {
    // `Self` is the implementor type: `Sheep`.
    fn new(name: &'static str) -> Sheep {
        Sheep { name: name, naked: false }
    }

    fn name(&self) -> &'static str {
        self.name
    }

    fn noise(&self) -> &'static str {
        if self.is_naked() {
            "baaaaah?"
        } else {
            "baaaaah!"
        }
    }
    
    // Default trait methods can be overridden.
    fn talk(&self) {
        // For example, we can add some quiet contemplation.
        println!("{} pauses briefly... {}", self.name, self.noise());
    }
}

fn main() {
    // Type annotation is necessary in this case.
    let mut dolly: Sheep = Animal::new("Dolly");
    // TODO ^ Try removing the type annotations.

    dolly.talk();
    dolly.shear();
    dolly.talk();
}

Derive

The compiler is capable of providing basic implementations for some traits via the #[derive] attribute. These traits can still be manually implemented if a more complex behavior is required.

The following is a list of derivable traits:

  • Comparison traits: Eq, PartialEq, Ord, PartialOrd.
  • Clone, to create T from &T via a copy.
  • Copy, to give a type 'copy semantics' instead of 'move semantics'.
  • Hash, to compute a hash from &T.
  • Default, to create an empty instance of a data type.
  • Debug, to format a value using the {:?} formatter.
// `Centimeters`, a tuple struct that can be compared
#[derive(PartialEq, PartialOrd)]
struct Centimeters(f64);

// `Inches`, a tuple struct that can be printed
#[derive(Debug)]
struct Inches(i32);

impl Inches {
    fn to_centimeters(&self) -> Centimeters {
        let &Inches(inches) = self;

        Centimeters(inches as f64 * 2.54)
    }
}

// `Seconds`, a tuple struct with no additional attributes
struct Seconds(i32);

fn main() {
    let _one_second = Seconds(1);

    // Error: `Seconds` can't be printed; it doesn't implement the `Debug` trait
    //println!("One second looks like: {:?}", _one_second);
    // TODO ^ Try uncommenting this line

    // Error: `Seconds` can't be compared; it doesn't implement the `PartialEq` trait
    //let _this_is_true = (_one_second == _one_second);
    // TODO ^ Try uncommenting this line

    let foot = Inches(12);

    println!("One foot equals {:?}", foot);

    let meter = Centimeters(100.0);

    let cmp =
        if foot.to_centimeters() < meter {
            "smaller"
        } else {
            "bigger"
        };

    println!("One foot is {} than one meter.", cmp);
}

See also:

derive

Returning Traits with dyn

The Rust compiler needs to know how much space every function's return type requires. This means all your functions have to return a concrete type. Unlike other languages, if you have a trait like Animal, you can't write a function that returns Animal, because its different implementations will need different amounts of memory.

However, there's an easy workaround. Instead of returning a trait object directly, our functions return a Box which contains some Animal. A box is just a reference to some memory in the heap. Because a reference has a statically-known size, and the compiler can guarantee it points to a heap-allocated Animal, we can return a trait from our function!

Rust tries to be as explicit as possible whenever it allocates memory on the heap. So if your function returns a pointer-to-trait-on-heap in this way, you need to write the return type with the dyn keyword, e.g. Box<dyn Animal>.

struct Sheep {}
struct Cow {}

trait Animal {
    // Instance method signature
    fn noise(&self) -> &'static str;
}

// Implement the `Animal` trait for `Sheep`.
impl Animal for Sheep {
    fn noise(&self) -> &'static str {
        "baaaaah!"
    }
}

// Implement the `Animal` trait for `Cow`.
impl Animal for Cow {
    fn noise(&self) -> &'static str {
        "moooooo!"
    }
}

// Returns some struct that implements Animal, but we don't know which one at compile time.
fn random_animal(random_number: f64) -> Box<dyn Animal> {
    if random_number < 0.5 {
        Box::new(Sheep {})
    } else {
        Box::new(Cow {})
    }
}

fn main() {
    let random_number = 0.234;
    let animal = random_animal(random_number);
    println!("You've randomly chosen an animal, and it says {}", animal.noise());
}

Operator Overloading

In Rust, many of the operators can be overloaded via traits. That is, some operators can be used to accomplish different tasks based on their input arguments. This is possible because operators are syntactic sugar for method calls. For example, the + operator in a + b calls the add method (as in a.add(b)). This add method is part of the Add trait. Hence, the + operator can be used by any implementor of the Add trait.

A list of the traits, such as Add, that overload operators can be found in core::ops.

use std::ops;

struct Foo;
struct Bar;

#[derive(Debug)]
struct FooBar;

#[derive(Debug)]
struct BarFoo;

// The `std::ops::Add` trait is used to specify the functionality of `+`.
// Here, we make `Add<Bar>` - the trait for addition with a RHS of type `Bar`.
// The following block implements the operation: Foo + Bar = FooBar
impl ops::Add<Bar> for Foo {
    type Output = FooBar;

    fn add(self, _rhs: Bar) -> FooBar {
        println!("> Foo.add(Bar) was called");

        FooBar
    }
}

// By reversing the types, we end up implementing non-commutative addition.
// Here, we make `Add<Foo>` - the trait for addition with a RHS of type `Foo`.
// This block implements the operation: Bar + Foo = BarFoo
impl ops::Add<Foo> for Bar {
    type Output = BarFoo;

    fn add(self, _rhs: Foo) -> BarFoo {
        println!("> Bar.add(Foo) was called");

        BarFoo
    }
}

fn main() {
    println!("Foo + Bar = {:?}", Foo + Bar);
    println!("Bar + Foo = {:?}", Bar + Foo);
}

See Also

Add, Syntax Index

Drop

The Drop trait only has one method: drop, which is called automatically when an object goes out of scope. The main use of the Drop trait is to free the resources that the implementor instance owns.

Box, Vec, String, File, and Process are some examples of types that implement the Drop trait to free resources. The Drop trait can also be manually implemented for any custom data type.

The following example adds a print to console to the drop function to announce when it is called.

struct Droppable {
    name: &'static str,
}

// This trivial implementation of `drop` adds a print to console.
impl Drop for Droppable {
    fn drop(&mut self) {
        println!("> Dropping {}", self.name);
    }
}

fn main() {
    let _a = Droppable { name: "a" };

    // block A
    {
        let _b = Droppable { name: "b" };

        // block B
        {
            let _c = Droppable { name: "c" };
            let _d = Droppable { name: "d" };

            println!("Exiting block B");
        }
        println!("Just exited block B");

        println!("Exiting block A");
    }
    println!("Just exited block A");

    // Variable can be manually dropped using the `drop` function
    drop(_a);
    // TODO ^ Try commenting this line

    println!("end of the main function");

    // `_a` *won't* be `drop`ed again here, because it already has been
    // (manually) `drop`ed
}

Iterators

The Iterator trait is used to implement iterators over collections such as arrays.

The trait requires only a method to be defined for the next element, which may be manually defined in an impl block or automatically defined (as in arrays and ranges).

As a point of convenience for common situations, the for construct turns some collections into iterators using the .into_iter() method.

struct Fibonacci {
    curr: u32,
    next: u32,
}

// Implement `Iterator` for `Fibonacci`.
// The `Iterator` trait only requires a method to be defined for the `next` element.
impl Iterator for Fibonacci {
    // We can refer to this type using Self::Item
    type Item = u32;

    // Here, we define the sequence using `.curr` and `.next`.
    // The return type is `Option<T>`:
    //     * When the `Iterator` is finished, `None` is returned.
    //     * Otherwise, the next value is wrapped in `Some` and returned.
    // We use Self::Item in the return type, so we can change
    // the type without having to update the function signatures.
    fn next(&mut self) -> Option<Self::Item> {
        let current = self.curr;

        self.curr = self.next;
        self.next = current + self.next;

        // Since there's no endpoint to a Fibonacci sequence, the `Iterator` 
        // will never return `None`, and `Some` is always returned.
        Some(current)
    }
}

// Returns a Fibonacci sequence generator
fn fibonacci() -> Fibonacci {
    Fibonacci { curr: 0, next: 1 }
}

fn main() {
    // `0..3` is an `Iterator` that generates: 0, 1, and 2.
    let mut sequence = 0..3;

    println!("Four consecutive `next` calls on 0..3");
    println!("> {:?}", sequence.next());
    println!("> {:?}", sequence.next());
    println!("> {:?}", sequence.next());
    println!("> {:?}", sequence.next());

    // `for` works through an `Iterator` until it returns `None`.
    // Each `Some` value is unwrapped and bound to a variable (here, `i`).
    println!("Iterate through 0..3 using `for`");
    for i in 0..3 {
        println!("> {}", i);
    }

    // The `take(n)` method reduces an `Iterator` to its first `n` terms.
    println!("The first four terms of the Fibonacci sequence are: ");
    for i in fibonacci().take(4) {
        println!("> {}", i);
    }

    // The `skip(n)` method shortens an `Iterator` by dropping its first `n` terms.
    println!("The next four terms of the Fibonacci sequence are: ");
    for i in fibonacci().skip(4).take(4) {
        println!("> {}", i);
    }

    let array = [1u32, 3, 3, 7];

    // The `iter` method produces an `Iterator` over an array/slice.
    println!("Iterate the following array {:?}", &array);
    for i in array.iter() {
        println!("> {}", i);
    }
}

impl Trait

impl Trait can be used in two locations:

  1. as an argument type
  2. as a return type

As an argument type

If your function is generic over a trait but you don't mind the specific type, you can simplify the function declaration using impl Trait as the type of the argument.

For example, consider the following code:

fn parse_csv_document<R: std::io::BufRead>(src: R) -> std::io::Result<Vec<Vec<String>>> {
    src.lines()
        .map(|line| {
            // For each line in the source
            line.map(|line| {
                // If the line was read successfully, process it, if not, return the error
                line.split(',') // Split the line separated by commas
                    .map(|entry| String::from(entry.trim())) // Remove leading and trailing whitespace
                    .collect() // Collect all strings in a row into a Vec<String>
            })
        })
        .collect() // Collect all lines into a Vec<Vec<String>>
}

parse_csv_document is generic, allowing it to take any type which implements BufRead, such as BufReader<File> or [u8], but it's not important what type R is, and R is only used to declare the type of src, so the function can also be written as:

fn parse_csv_document(src: impl std::io::BufRead) -> std::io::Result<Vec<Vec<String>>> {
    src.lines()
        .map(|line| {
            // For each line in the source
            line.map(|line| {
                // If the line was read successfully, process it, if not, return the error
                line.split(',') // Split the line separated by commas
                    .map(|entry| String::from(entry.trim())) // Remove leading and trailing whitespace
                    .collect() // Collect all strings in a row into a Vec<String>
            })
        })
        .collect() // Collect all lines into a Vec<Vec<String>>
}

Note that using impl Trait as an argument type means that you cannot explicitly state what form of the function you use, i.e. parse_csv_document::<std::io::Empty>(std::io::empty()) will not work with the second example.

As a return type

If your function returns a type that implements MyTrait, you can write its return type as -> impl MyTrait. This can help simplify your type signatures quite a lot!

use std::iter;
use std::vec::IntoIter;

// This function combines two `Vec<i32>` and returns an iterator over it.
// Look how complicated its return type is!
fn combine_vecs_explicit_return_type(
    v: Vec<i32>,
    u: Vec<i32>,
) -> iter::Cycle<iter::Chain<IntoIter<i32>, IntoIter<i32>>> {
    v.into_iter().chain(u.into_iter()).cycle()
}

// This is the exact same function, but its return type uses `impl Trait`.
// Look how much simpler it is!
fn combine_vecs(
    v: Vec<i32>,
    u: Vec<i32>,
) -> impl Iterator<Item=i32> {
    v.into_iter().chain(u.into_iter()).cycle()
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![4, 5];
    let mut v3 = combine_vecs(v1, v2);
    assert_eq!(Some(1), v3.next());
    assert_eq!(Some(2), v3.next());
    assert_eq!(Some(3), v3.next());
    assert_eq!(Some(4), v3.next());
    assert_eq!(Some(5), v3.next());
    println!("all done");
}

More importantly, some Rust types can't be written out. For example, every closure has its own unnamed concrete type. Before impl Trait syntax, you had to allocate on the heap in order to return a closure. But now you can do it all statically, like this:

// Returns a function that adds `y` to its input
fn make_adder_function(y: i32) -> impl Fn(i32) -> i32 {
    let closure = move |x: i32| { x + y };
    closure
}

fn main() {
    let plus_one = make_adder_function(1);
    assert_eq!(plus_one(2), 3);
}

You can also use impl Trait to return an iterator that uses map or filter closures! This makes using map and filter easier. Because closure types don't have names, you can't write out an explicit return type if your function returns iterators with closures. But with impl Trait you can do this easily:

fn double_positives<'a>(numbers: &'a Vec<i32>) -> impl Iterator<Item = i32> + 'a {
    numbers
        .iter()
        .filter(|x| x > &&0)
        .map(|x| x * 2)
}

fn main() {
    let singles = vec![-3, -2, 2, 3];
    let doubles = double_positives(&singles);
    assert_eq!(doubles.collect::<Vec<i32>>(), vec![4, 6]);
}

Clone

When dealing with resources, the default behavior is to transfer them during assignments or function calls. However, sometimes we need to make a copy of the resource as well.

The Clone trait helps us do exactly this. Most commonly, we can use the .clone() method defined by the Clone trait.

// A unit struct without resources
#[derive(Debug, Clone, Copy)]
struct Unit;

// A tuple struct with resources that implements the `Clone` trait
#[derive(Clone, Debug)]
struct Pair(Box<i32>, Box<i32>);

fn main() {
    // Instantiate `Unit`
    let unit = Unit;
    // Copy `Unit`, there are no resources to move
    let copied_unit = unit;

    // Both `Unit`s can be used independently
    println!("original: {:?}", unit);
    println!("copy: {:?}", copied_unit);

    // Instantiate `Pair`
    let pair = Pair(Box::new(1), Box::new(2));
    println!("original: {:?}", pair);

    // Move `pair` into `moved_pair`, moves resources
    let moved_pair = pair;
    println!("moved: {:?}", moved_pair);

    // Error! `pair` has lost its resources
    //println!("original: {:?}", pair);
    // TODO ^ Try uncommenting this line

    // Clone `moved_pair` into `cloned_pair` (resources are included)
    let cloned_pair = moved_pair.clone();
    // Drop the original pair using std::mem::drop
    drop(moved_pair);

    // Error! `moved_pair` has been dropped
    //println!("copy: {:?}", moved_pair);
    // TODO ^ Try uncommenting this line

    // The result from .clone() can still be used!
    println!("clone: {:?}", cloned_pair);
}

Supertraits

Rust doesn't have "inheritance", but you can define a trait as being a superset of another trait. For example:

trait Person {
    fn name(&self) -> String;
}

// Person is a supertrait of Student.
// Implementing Student requires you to also impl Person.
trait Student: Person {
    fn university(&self) -> String;
}

trait Programmer {
    fn fav_language(&self) -> String;
}

// CompSciStudent (computer science student) is a subtrait of both Programmer 
// and Student. Implementing CompSciStudent requires you to impl both supertraits.
trait CompSciStudent: Programmer + Student {
    fn git_username(&self) -> String;
}

fn comp_sci_student_greeting(student: &dyn CompSciStudent) -> String {
    format!(
        "My name is {} and I attend {}. My favorite language is {}. My Git username is {}",
        student.name(),
        student.university(),
        student.fav_language(),
        student.git_username()
    )
}

fn main() {}

See also:

The Rust Programming Language chapter on supertraits

Disambiguating overlapping traits

A type can implement many different traits. What if two traits both require the same name? For example, many traits might have a method named get(). They might even have different return types!

Good news: because each trait implementation gets its own impl block, it's clear which trait's get method you're implementing.

What about when it comes time to call those methods? To disambiguate between them, we have to use Fully Qualified Syntax.

trait UsernameWidget {
    // Get the selected username out of this widget
    fn get(&self) -> String;
}

trait AgeWidget {
    // Get the selected age out of this widget
    fn get(&self) -> u8;
}

// A form with both a UsernameWidget and an AgeWidget
struct Form {
    username: String,
    age: u8,
}

impl UsernameWidget for Form {
    fn get(&self) -> String {
        self.username.clone()
    }
}

impl AgeWidget for Form {
    fn get(&self) -> u8 {
        self.age
    }
}

fn main() {
    let form = Form {
        username: "rustacean".to_owned(),
        age: 28,
    };

    // If you uncomment this line, you'll get an error saying
    // "multiple `get` found". Because, after all, there are multiple methods
    // named `get`.
    // println!("{}", form.get());

    let username = <Form as UsernameWidget>::get(&form);
    assert_eq!("rustacean".to_owned(), username);
    let age = <Form as AgeWidget>::get(&form);
    assert_eq!(28, age);
}

See also:

The Rust Programming Language chapter on Fully Qualified syntax

macro_rules!

Rust provides a powerful macro system that allows metaprogramming. As you've seen in previous chapters, macros look like functions, except that their name ends with a bang !, but instead of generating a function call, macros are expanded into source code that gets compiled with the rest of the program. However, unlike macros in C and other languages, Rust macros are expanded into abstract syntax trees, rather than string preprocessing, so you don't get unexpected precedence bugs.

Macros are created using the macro_rules! macro:

// This is a simple macro named `say_hello`.
macro_rules! say_hello {
    // `()` indicates that the macro takes no argument.
    () => {
        // The macro will expand into the contents of this block.
        println!("Hello!")
    };
}

fn main() {
    // This call will expand into `println!("Hello")`
    say_hello!()
}

So why are macros useful?

  1. Don't repeat yourself. There are many cases where you may need similar functionality in multiple places but with different types. Often, writing a macro is a useful way to avoid repeating code. (More on this later)

  2. Domain-specific languages. Macros allow you to define special syntax for a specific purpose. (More on this later)

  3. Variadic interfaces. Sometimes you want to define an interface that takes a variable number of arguments. An example is println! which could take any number of arguments, depending on the format string. (More on this later)

Syntax

In following subsections, we will show how to define macros in Rust. There are three basic ideas:

Designators

The arguments of a macro are prefixed by a dollar sign $ and type annotated with a designator:

macro_rules! create_function {
    // This macro takes an argument of designator `ident` and
    // creates a function named `$func_name`.
    // The `ident` designator is used for variable/function names.
    ($func_name:ident) => {
        fn $func_name() {
            // The `stringify!` macro converts an `ident` into a string.
            println!("You called {:?}()",
                     stringify!($func_name));
        }
    };
}

// Create functions named `foo` and `bar` with the above macro.
create_function!(foo);
create_function!(bar);

macro_rules! print_result {
    // This macro takes an expression of type `expr` and prints
    // it as a string along with its result.
    // The `expr` designator is used for expressions.
    ($expression:expr) => {
        // `stringify!` will convert the expression *as it is* into a string.
        println!("{:?} = {:?}",
                 stringify!($expression),
                 $expression);
    };
}

fn main() {
    foo();
    bar();

    print_result!(1u32 + 1);

    // Recall that blocks are expressions too!
    print_result!({
        let x = 1u32;

        x * x + 2 * x - 1
    });
}

These are some of the available designators:

  • block
  • expr is used for expressions
  • ident is used for variable/function names
  • item
  • literal is used for literal constants
  • pat (pattern)
  • path
  • stmt (statement)
  • tt (token tree)
  • ty (type)
  • vis (visibility qualifier)

For a complete list, see the Rust Reference.

Overload

Macros can be overloaded to accept different combinations of arguments. In that regard, macro_rules! can work similarly to a match block:

// `test!` will compare `$left` and `$right`
// in different ways depending on how you invoke it:
macro_rules! test {
    // Arguments don't need to be separated by a comma.
    // Any template can be used!
    ($left:expr; and $right:expr) => {
        println!("{:?} and {:?} is {:?}",
                 stringify!($left),
                 stringify!($right),
                 $left && $right)
    };
    // ^ each arm must end with a semicolon.
    ($left:expr; or $right:expr) => {
        println!("{:?} or {:?} is {:?}",
                 stringify!($left),
                 stringify!($right),
                 $left || $right)
    };
}

fn main() {
    test!(1i32 + 1 == 2i32; and 2i32 * 2 == 4i32);
    test!(true; or false);
}

Repeat

Macros can use + in the argument list to indicate that an argument may repeat at least once, or *, to indicate that the argument may repeat zero or more times.

In the following example, surrounding the matcher with $(...),+ will match one or more expression, separated by commas. Also note that the semicolon is optional on the last case.

// `find_min!` will calculate the minimum of any number of arguments.
macro_rules! find_min {
    // Base case:
    ($x:expr) => ($x);
    // `$x` followed by at least one `$y,`
    ($x:expr, $($y:expr),+) => (
        // Call `find_min!` on the tail `$y`
        std::cmp::min($x, find_min!($($y),+))
    )
}

fn main() {
    println!("{}", find_min!(1));
    println!("{}", find_min!(1 + 2, 2));
    println!("{}", find_min!(5, 2 * 3, 4));
}

DRY (Don't Repeat Yourself)

Macros allow writing DRY code by factoring out the common parts of functions and/or test suites. Here is an example that implements and tests the +=, *= and -= operators on Vec<T>:

use std::ops::{Add, Mul, Sub};

macro_rules! assert_equal_len {
    // The `tt` (token tree) designator is used for
    // operators and tokens.
    ($a:expr, $b:expr, $func:ident, $op:tt) => {
        assert!($a.len() == $b.len(),
                "{:?}: dimension mismatch: {:?} {:?} {:?}",
                stringify!($func),
                ($a.len(),),
                stringify!($op),
                ($b.len(),));
    };
}

macro_rules! op {
    ($func:ident, $bound:ident, $op:tt, $method:ident) => {
        fn $func<T: $bound<T, Output=T> + Copy>(xs: &mut Vec<T>, ys: &Vec<T>) {
            assert_equal_len!(xs, ys, $func, $op);

            for (x, y) in xs.iter_mut().zip(ys.iter()) {
                *x = $bound::$method(*x, *y);
                // *x = x.$method(*y);
            }
        }
    };
}

// Implement `add_assign`, `mul_assign`, and `sub_assign` functions.
op!(add_assign, Add, +=, add);
op!(mul_assign, Mul, *=, mul);
op!(sub_assign, Sub, -=, sub);

mod test {
    use std::iter;
    macro_rules! test {
        ($func:ident, $x:expr, $y:expr, $z:expr) => {
            #[test]
            fn $func() {
                for size in 0usize..10 {
                    let mut x: Vec<_> = iter::repeat($x).take(size).collect();
                    let y: Vec<_> = iter::repeat($y).take(size).collect();
                    let z: Vec<_> = iter::repeat($z).take(size).collect();

                    super::$func(&mut x, &y);

                    assert_eq!(x, z);
                }
            }
        };
    }

    // Test `add_assign`, `mul_assign`, and `sub_assign`.
    test!(add_assign, 1u32, 2u32, 3u32);
    test!(mul_assign, 2u32, 3u32, 6u32);
    test!(sub_assign, 3u32, 2u32, 1u32);
}
$ rustc --test dry.rs && ./dry
running 3 tests
test test::mul_assign ... ok
test test::add_assign ... ok
test test::sub_assign ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured

Domain Specific Languages (DSLs)

A DSL is a mini "language" embedded in a Rust macro. It is completely valid Rust because the macro system expands into normal Rust constructs, but it looks like a small language. This allows you to define concise or intuitive syntax for some special functionality (within bounds).

Suppose that I want to define a little calculator API. I would like to supply an expression and have the output printed to console.

macro_rules! calculate {
    (eval $e:expr) => {
        {
            let val: usize = $e; // Force types to be integers
            println!("{} = {}", stringify!{$e}, val);
        }
    };
}

fn main() {
    calculate! {
        eval 1 + 2 // hehehe `eval` is _not_ a Rust keyword!
    }

    calculate! {
        eval (1 + 2) * (3 / 4)
    }
}

Output:

1 + 2 = 3
(1 + 2) * (3 / 4) = 0

This was a very simple example, but much more complex interfaces have been developed, such as lazy_static or clap.

Also, note the two pairs of braces in the macro. The outer ones are part of the syntax of macro_rules!, in addition to () or [].

Variadic Interfaces

A variadic interface takes an arbitrary number of arguments. For example, println! can take an arbitrary number of arguments, as determined by the format string.

We can extend our calculate! macro from the previous section to be variadic:

macro_rules! calculate {
    // The pattern for a single `eval`
    (eval $e:expr) => {
        {
            let val: usize = $e; // Force types to be integers
            println!("{} = {}", stringify!{$e}, val);
        }
    };

    // Decompose multiple `eval`s recursively
    (eval $e:expr, $(eval $es:expr),+) => {{
        calculate! { eval $e }
        calculate! { $(eval $es),+ }
    }};
}

fn main() {
    calculate! { // Look ma! Variadic `calculate!`!
        eval 1 + 2,
        eval 3 + 4,
        eval (2 * 3) + 1
    }
}

Output:

1 + 2 = 3
3 + 4 = 7
(2 * 3) + 1 = 7

Error handling

Error handling is the process of handling the possibility of failure. For example, failing to read a file and then continuing to use that bad input would clearly be problematic. Noticing and explicitly managing those errors saves the rest of the program from various pitfalls.

There are various ways to deal with errors in Rust, which are described in the following subchapters. They all have more or less subtle differences and different use cases. As a rule of thumb:

An explicit panic is mainly useful for tests and dealing with unrecoverable errors. For prototyping it can be useful, for example when dealing with functions that haven't been implemented yet, but in those cases the more descriptive unimplemented is better. In tests panic is a reasonable way to explicitly fail.

The Option type is for when a value is optional or when the lack of a value is not an error condition. For example the parent of a directory - / and C: don't have one. When dealing with Options, unwrap is fine for prototyping and cases where it's absolutely certain that there is guaranteed to be a value. However expect is more useful since it lets you specify an error message in case something goes wrong anyway.

When there is a chance that things do go wrong and the caller has to deal with the problem, use Result. You can unwrap and expect them as well (please don't do that unless it's a test or quick prototype).

For a more rigorous discussion of error handling, refer to the error handling section in the official book.

panic

The simplest error handling mechanism we will see is panic. It prints an error message, starts unwinding the stack, and usually exits the program. Here, we explicitly call panic on our error condition:

fn drink(beverage: &str) {
    // You shouldn't drink too much sugary beverages.
    if beverage == "lemonade" { panic!("AAAaaaaa!!!!"); }

    println!("Some refreshing {} is all I need.", beverage);
}

fn main() {
    drink("water");
    drink("lemonade");
    drink("still water");
}

The first call to drink works. The second panics and thus the third is never called.

abort and unwind

The previous section illustrates the error handling mechanism panic. Different code paths can be conditionally compiled based on the panic setting. The current values available are unwind and abort.

Building on the prior lemonade example, we explicitly use the panic strategy to exercise different lines of code.


fn drink(beverage: &str) {
   // You shouldn't drink too much sugary beverages.
    if beverage == "lemonade" {
        if cfg!(panic="abort"){ println!("This is not your party. Run!!!!");}
        else{ println!("Spit it out!!!!");}
    }
    else{ println!("Some refreshing {} is all I need.", beverage); }
}

fn main() {
    drink("water");
    drink("lemonade");
}

Here is another example focusing on rewriting drink() and explicitly use the unwind keyword.


#[cfg(panic = "unwind")]
fn ah(){ println!("Spit it out!!!!");}

#[cfg(not(panic="unwind"))]
fn ah(){ println!("This is not your party. Run!!!!");}

fn drink(beverage: &str){
    if beverage == "lemonade"{ ah();}
    else{println!("Some refreshing {} is all I need.", beverage);}
}

fn main() {
    drink("water");
    drink("lemonade");
}

The panic strategy can be set from the command line by using abort or unwind.

rustc  lemonade.rs -C panic=abort

Option & unwrap

In the last example, we showed that we can induce program failure at will. We told our program to panic if we drink a sugary lemonade. But what if we expect some drink but don't receive one? That case would be just as bad, so it needs to be handled!

We could test this against the null string ("") as we do with a lemonade. Since we're using Rust, let's instead have the compiler point out cases where there's no drink.

An enum called Option<T> in the std library is used when absence is a possibility. It manifests itself as one of two "options":

  • Some(T): An element of type T was found
  • None: No element was found

These cases can either be explicitly handled via match or implicitly with unwrap. Implicit handling will either return the inner element or panic.

Note that it's possible to manually customize panic with expect, but unwrap otherwise leaves us with a less meaningful output than explicit handling. In the following example, explicit handling yields a more controlled result while retaining the option to panic if desired.

// The adult has seen it all, and can handle any drink well.
// All drinks are handled explicitly using `match`.
fn give_adult(drink: Option<&str>) {
    // Specify a course of action for each case.
    match drink {
        Some("lemonade") => println!("Yuck! Too sugary."),
        Some(inner)   => println!("{}? How nice.", inner),
        None          => println!("No drink? Oh well."),
    }
}

// Others will `panic` before drinking sugary drinks.
// All drinks are handled implicitly using `unwrap`.
fn drink(drink: Option<&str>) {
    // `unwrap` returns a `panic` when it receives a `None`.
    let inside = drink.unwrap();
    if inside == "lemonade" { panic!("AAAaaaaa!!!!"); }

    println!("I love {}s!!!!!", inside);
}

fn main() {
    let water  = Some("water");
    let lemonade = Some("lemonade");
    let void  = None;

    give_adult(water);
    give_adult(lemonade);
    give_adult(void);

    let coffee = Some("coffee");
    let nothing = None;

    drink(coffee);
    drink(nothing);
}

Unpacking options with ?

You can unpack Options by using match statements, but it's often easier to use the ? operator. If x is an Option, then evaluating x? will return the underlying value if x is Some, otherwise it will terminate whatever function is being executed and return None.

fn next_birthday(current_age: Option<u8>) -> Option<String> {
	// If `current_age` is `None`, this returns `None`.
	// If `current_age` is `Some`, the inner `u8` gets assigned to `next_age`
    let next_age: u8 = current_age? + 1;
    Some(format!("Next year I will be {}", next_age))
}

You can chain many ?s together to make your code much more readable.

struct Person {
    job: Option<Job>,
}

#[derive(Clone, Copy)]
struct Job {
    phone_number: Option<PhoneNumber>,
}

#[derive(Clone, Copy)]
struct PhoneNumber {
    area_code: Option<u8>,
    number: u32,
}

impl Person {

    // Gets the area code of the phone number of the person's job, if it exists.
    fn work_phone_area_code(&self) -> Option<u8> {
        // This would need many nested `match` statements without the `?` operator.
        // It would take a lot more code - try writing it yourself and see which
        // is easier.
        self.job?.phone_number?.area_code
    }
}

fn main() {
    let p = Person {
        job: Some(Job {
            phone_number: Some(PhoneNumber {
                area_code: Some(61),
                number: 439222222,
            }),
        }),
    };

    assert_eq!(p.work_phone_area_code(), Some(61));
}

Combinators: map

match is a valid method for handling Options. However, you may eventually find heavy usage tedious, especially with operations only valid with an input. In these cases, combinators can be used to manage control flow in a modular fashion.

Option has a built in method called map(), a combinator for the simple mapping of Some -> Some and None -> None. Multiple map() calls can be chained together for even more flexibility.

In the following example, process() replaces all functions previous to it while staying compact.

#![allow(dead_code)]

#[derive(Debug)] enum Food { Apple, Carrot, Potato }

#[derive(Debug)] struct Peeled(Food);
#[derive(Debug)] struct Chopped(Food);
#[derive(Debug)] struct Cooked(Food);

// Peeling food. If there isn't any, then return `None`.
// Otherwise, return the peeled food.
fn peel(food: Option<Food>) -> Option<Peeled> {
    match food {
        Some(food) => Some(Peeled(food)),
        None       => None,
    }
}

// Chopping food. If there isn't any, then return `None`.
// Otherwise, return the chopped food.
fn chop(peeled: Option<Peeled>) -> Option<Chopped> {
    match peeled {
        Some(Peeled(food)) => Some(Chopped(food)),
        None               => None,
    }
}

// Cooking food. Here, we showcase `map()` instead of `match` for case handling.
fn cook(chopped: Option<Chopped>) -> Option<Cooked> {
    chopped.map(|Chopped(food)| Cooked(food))
}

// A function to peel, chop, and cook food all in sequence.
// We chain multiple uses of `map()` to simplify the code.
fn process(food: Option<Food>) -> Option<Cooked> {
    food.map(|f| Peeled(f))
        .map(|Peeled(f)| Chopped(f))
        .map(|Chopped(f)| Cooked(f))
}

// Check whether there's food or not before trying to eat it!
fn eat(food: Option<Cooked>) {
    match food {
        Some(food) => println!("Mmm. I love {:?}", food),
        None       => println!("Oh no! It wasn't edible."),
    }
}

fn main() {
    let apple = Some(Food::Apple);
    let carrot = Some(Food::Carrot);
    let potato = None;

    let cooked_apple = cook(chop(peel(apple)));
    let cooked_carrot = cook(chop(peel(carrot)));
    // Let's try the simpler looking `process()` now.
    let cooked_potato = process(potato);

    eat(cooked_apple);
    eat(cooked_carrot);
    eat(cooked_potato);
}

See also:

closures, Option, Option::map()

Combinators: and_then

map() was described as a chainable way to simplify match statements. However, using map() on a function that returns an Option<T> results in the nested Option<Option<T>>. Chaining multiple calls together can then become confusing. That's where another combinator called and_then(), known in some languages as flatmap, comes in.

and_then() calls its function input with the wrapped value and returns the result. If the Option is None, then it returns None instead.

In the following example, cookable_v2() results in an Option<Food>. Using map() instead of and_then() would have given an Option<Option<Food>>, which is an invalid type for eat().

#![allow(dead_code)]

#[derive(Debug)] enum Food { CordonBleu, Steak, Sushi }
#[derive(Debug)] enum Day { Monday, Tuesday, Wednesday }

// We don't have the ingredients to make Sushi.
fn have_ingredients(food: Food) -> Option<Food> {
    match food {
        Food::Sushi => None,
        _           => Some(food),
    }
}

// We have the recipe for everything except Cordon Bleu.
fn have_recipe(food: Food) -> Option<Food> {
    match food {
        Food::CordonBleu => None,
        _                => Some(food),
    }
}

// To make a dish, we need both the recipe and the ingredients.
// We can represent the logic with a chain of `match`es:
fn cookable_v1(food: Food) -> Option<Food> {
    match have_recipe(food) {
        None       => None,
        Some(food) => have_ingredients(food),
    }
}

// This can conveniently be rewritten more compactly with `and_then()`:
fn cookable_v2(food: Food) -> Option<Food> {
    have_recipe(food).and_then(have_ingredients)
}

fn eat(food: Food, day: Day) {
    match cookable_v2(food) {
        Some(food) => println!("Yay! On {:?} we get to eat {:?}.", day, food),
        None       => println!("Oh no. We don't get to eat on {:?}?", day),
    }
}

fn main() {
    let (cordon_bleu, steak, sushi) = (Food::CordonBleu, Food::Steak, Food::Sushi);

    eat(cordon_bleu, Day::Monday);
    eat(steak, Day::Tuesday);
    eat(sushi, Day::Wednesday);
}

See also:

closures, Option, and Option::and_then()

Unpacking options and defaults

There is more than one way to unpack an Option and fall back on a default if it is None. To choose the one that meets our needs, we need to consider the following:

  • do we need eager or lazy evaluation?
  • do we need to keep the original empty value intact, or modify it in place?

or() is chainable, evaluates eagerly, keeps empty value intact

or()is chainable and eagerly evaluates its argument, as is shown in the following example. Note that because or's arguments are evaluated eagerly, the variable passed to or is moved.

#[derive(Debug)] 
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let apple = Some(Fruit::Apple);
    let orange = Some(Fruit::Orange);
    let no_fruit: Option<Fruit> = None;

    let first_available_fruit = no_fruit.or(orange).or(apple);
    println!("first_available_fruit: {:?}", first_available_fruit);
    // first_available_fruit: Some(Orange)

    // `or` moves its argument.
    // In the example above, `or(orange)` returned a `Some`, so `or(apple)` was not invoked.
    // But the variable named `apple` has been moved regardless, and cannot be used anymore.
    // println!("Variable apple was moved, so this line won't compile: {:?}", apple);
    // TODO: uncomment the line above to see the compiler error
 }

or_else() is chainable, evaluates lazily, keeps empty value intact

Another alternative is to use or_else, which is also chainable, and evaluates lazily, as is shown in the following example:

#[derive(Debug)] 
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let apple = Some(Fruit::Apple);
    let no_fruit: Option<Fruit> = None;
    let get_kiwi_as_fallback = || {
        println!("Providing kiwi as fallback");
        Some(Fruit::Kiwi)
    };
    let get_lemon_as_fallback = || {
        println!("Providing lemon as fallback");
        Some(Fruit::Lemon)
    };

    let first_available_fruit = no_fruit
        .or_else(get_kiwi_as_fallback)
        .or_else(get_lemon_as_fallback);
    println!("first_available_fruit: {:?}", first_available_fruit);
    // Providing kiwi as fallback
    // first_available_fruit: Some(Kiwi)
}

get_or_insert() evaluates eagerly, modifies empty value in place

To make sure that an Option contains a value, we can use get_or_insert to modify it in place with a fallback value, as is shown in the following example. Note that get_or_insert eagerly evaluates its parameter, so variable apple is moved:

#[derive(Debug)]
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let mut my_fruit: Option<Fruit> = None;
    let apple = Fruit::Apple;
    let first_available_fruit = my_fruit.get_or_insert(apple);
    println!("first_available_fruit is: {:?}", first_available_fruit);
    println!("my_fruit is: {:?}", my_fruit);
    // first_available_fruit is: Apple
    // my_fruit is: Some(Apple)
    //println!("Variable named `apple` is moved: {:?}", apple);
    // TODO: uncomment the line above to see the compiler error
}

get_or_insert_with() evaluates lazily, modifies empty value in place

Instead of explicitly providing a value to fall back on, we can pass a closure to get_or_insert_with, as follows:

#[derive(Debug)] 
enum Fruit { Apple, Orange, Banana, Kiwi, Lemon }

fn main() {
    let mut my_fruit: Option<Fruit> = None;
    let get_lemon_as_fallback = || {
        println!("Providing lemon as fallback");
        Fruit::Lemon
    };
    let first_available_fruit = my_fruit
        .get_or_insert_with(get_lemon_as_fallback);
    println!("first_available_fruit is: {:?}", first_available_fruit);
    println!("my_fruit is: {:?}", my_fruit);
    // Providing lemon as fallback
    // first_available_fruit is: Lemon
    // my_fruit is: Some(Lemon)

    // If the Option has a value, it is left unchanged, and the closure is not invoked
    let mut my_apple = Some(Fruit::Apple);
    let should_be_apple = my_apple.get_or_insert_with(get_lemon_as_fallback);
    println!("should_be_apple is: {:?}", should_be_apple);
    println!("my_apple is unchanged: {:?}", my_apple);
    // The output is a follows. Note that the closure `get_lemon_as_fallback` is not invoked
    // should_be_apple is: Apple
    // my_apple is unchanged: Some(Apple)
}

See also:

closures, get_or_insert, get_or_insert_with, ,moved variables, or, or_else

Result

Result is a richer version of the Option type that describes possible error instead of possible absence.

That is, Result<T, E> could have one of two outcomes:

  • Ok(T): An element T was found
  • Err(E): An error was found with element E

By convention, the expected outcome is Ok while the unexpected outcome is Err.

Like Option, Result has many methods associated with it. unwrap(), for example, either yields the element T or panics. For case handling, there are many combinators between Result and Option that overlap.

In working with Rust, you will likely encounter methods that return the Result type, such as the parse() method. It might not always be possible to parse a string into the other type, so parse() returns a Result indicating possible failure.

Let's see what happens when we successfully and unsuccessfully parse() a string:

fn multiply(first_number_str: &str, second_number_str: &str) -> i32 {
    // Let's try using `unwrap()` to get the number out. Will it bite us?
    let first_number = first_number_str.parse::<i32>().unwrap();
    let second_number = second_number_str.parse::<i32>().unwrap();
    first_number * second_number
}

fn main() {
    let twenty = multiply("10", "2");
    println!("double is {}", twenty);

    let tt = multiply("t", "2");
    println!("double is {}", tt);
}

In the unsuccessful case, parse() leaves us with an error for unwrap() to panic on. Additionally, the panic exits our program and provides an unpleasant error message.

To improve the quality of our error message, we should be more specific about the return type and consider explicitly handling the error.

Using Result in main

The Result type can also be the return type of the main function if specified explicitly. Typically the main function will be of the form:

fn main() {
    println!("Hello World!");
}

However main is also able to have a return type of Result. If an error occurs within the main function it will return an error code and print a debug representation of the error (using the Debug trait). The following example shows such a scenario and touches on aspects covered in the following section.

use std::num::ParseIntError;

fn main() -> Result<(), ParseIntError> {
    let number_str = "10";
    let number = match number_str.parse::<i32>() {
        Ok(number)  => number,
        Err(e) => return Err(e),
    };
    println!("{}", number);
    Ok(())
}

map for Result

Panicking in the previous example's multiply does not make for robust code. Generally, we want to return the error to the caller so it can decide what is the right way to respond to errors.

We first need to know what kind of error type we are dealing with. To determine the Err type, we look to parse(), which is implemented with the FromStr trait for i32. As a result, the Err type is specified as ParseIntError.

In the example below, the straightforward match statement leads to code that is overall more cumbersome.

use std::num::ParseIntError;

// With the return type rewritten, we use pattern matching without `unwrap()`.
fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    match first_number_str.parse::<i32>() {
        Ok(first_number)  => {
            match second_number_str.parse::<i32>() {
                Ok(second_number)  => {
                    Ok(first_number * second_number)
                },
                Err(e) => Err(e),
            }
        },
        Err(e) => Err(e),
    }
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    // This still presents a reasonable answer.
    let twenty = multiply("10", "2");
    print(twenty);

    // The following now provides a much more helpful error message.
    let tt = multiply("t", "2");
    print(tt);
}

Luckily, Option's map, and_then, and many other combinators are also implemented for Result. Result contains a complete listing.

use std::num::ParseIntError;

// As with `Option`, we can use combinators such as `map()`.
// This function is otherwise identical to the one above and reads:
// Multiply if both values can be parsed from str, otherwise pass on the error.
fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    first_number_str.parse::<i32>().and_then(|first_number| {
        second_number_str.parse::<i32>().map(|second_number| first_number * second_number)
    })
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    // This still presents a reasonable answer.
    let twenty = multiply("10", "2");
    print(twenty);

    // The following now provides a much more helpful error message.
    let tt = multiply("t", "2");
    print(tt);
}

aliases for Result

How about when we want to reuse a specific Result type many times? Recall that Rust allows us to create aliases. Conveniently, we can define one for the specific Result in question.

At a module level, creating aliases can be particularly helpful. Errors found in a specific module often have the same Err type, so a single alias can succinctly define all associated Results. This is so useful that the std library even supplies one: io::Result!

Here's a quick example to show off the syntax:

use std::num::ParseIntError;

// Define a generic alias for a `Result` with the error type `ParseIntError`.
type AliasedResult<T> = Result<T, ParseIntError>;

// Use the above alias to refer to our specific `Result` type.
fn multiply(first_number_str: &str, second_number_str: &str) -> AliasedResult<i32> {
    first_number_str.parse::<i32>().and_then(|first_number| {
        second_number_str.parse::<i32>().map(|second_number| first_number * second_number)
    })
}

// Here, the alias again allows us to save some space.
fn print(result: AliasedResult<i32>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

See also:

io::Result

Early returns

In the previous example, we explicitly handled the errors using combinators. Another way to deal with this case analysis is to use a combination of match statements and early returns.

That is, we can simply stop executing the function and return the error if one occurs. For some, this form of code can be easier to both read and write. Consider this version of the previous example, rewritten using early returns:

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = match first_number_str.parse::<i32>() {
        Ok(first_number)  => first_number,
        Err(e) => return Err(e),
    };

    let second_number = match second_number_str.parse::<i32>() {
        Ok(second_number)  => second_number,
        Err(e) => return Err(e),
    };

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

At this point, we've learned to explicitly handle errors using combinators and early returns. While we generally want to avoid panicking, explicitly handling all of our errors is cumbersome.

In the next section, we'll introduce ? for the cases where we simply need to unwrap without possibly inducing panic.

Introducing ?

Sometimes we just want the simplicity of unwrap without the possibility of a panic. Until now, unwrap has forced us to nest deeper and deeper when what we really wanted was to get the variable out. This is exactly the purpose of ?.

Upon finding an Err, there are two valid actions to take:

  1. panic! which we already decided to try to avoid if possible
  2. return because an Err means it cannot be handled

? is almost1 exactly equivalent to an unwrap which returns instead of panicking on Errs. Let's see how we can simplify the earlier example that used combinators:

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = first_number_str.parse::<i32>()?;
    let second_number = second_number_str.parse::<i32>()?;

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

The try! macro

Before there was ?, the same functionality was achieved with the try! macro. The ? operator is now recommended, but you may still find try! when looking at older code. The same multiply function from the previous example would look like this using try!:

// To compile and run this example without errors, while using Cargo, change the value 
// of the `edition` field, in the `[package]` section of the `Cargo.toml` file, to "2015".

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = try!(first_number_str.parse::<i32>());
    let second_number = try!(second_number_str.parse::<i32>());

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}
1

See re-enter ? for more details.

Multiple error types

The previous examples have always been very convenient; Results interact with other Results and Options interact with other Options.

Sometimes an Option needs to interact with a Result, or a Result<T, Error1> needs to interact with a Result<T, Error2>. In those cases, we want to manage our different error types in a way that makes them composable and easy to interact with.

In the following code, two instances of unwrap generate different error types. Vec::first returns an Option, while parse::<i32> returns a Result<i32, ParseIntError>:

fn double_first(vec: Vec<&str>) -> i32 {
    let first = vec.first().unwrap(); // Generate error 1
    2 * first.parse::<i32>().unwrap() // Generate error 2
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {}", double_first(numbers));

    println!("The first doubled is {}", double_first(empty));
    // Error 1: the input vector is empty

    println!("The first doubled is {}", double_first(strings));
    // Error 2: the element doesn't parse to a number
}

Over the next sections, we'll see several strategies for handling these kind of problems.

Pulling Results out of Options

The most basic way of handling mixed error types is to just embed them in each other.

use std::num::ParseIntError;

fn double_first(vec: Vec<&str>) -> Option<Result<i32, ParseIntError>> {
    vec.first().map(|first| {
        first.parse::<i32>().map(|n| 2 * n)
    })
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {:?}", double_first(numbers));

    println!("The first doubled is {:?}", double_first(empty));
    // Error 1: the input vector is empty

    println!("The first doubled is {:?}", double_first(strings));
    // Error 2: the element doesn't parse to a number
}

There are times when we'll want to stop processing on errors (like with ?) but keep going when the Option is None. A couple of combinators come in handy to swap the Result and Option.

use std::num::ParseIntError;

fn double_first(vec: Vec<&str>) -> Result<Option<i32>, ParseIntError> {
    let opt = vec.first().map(|first| {
        first.parse::<i32>().map(|n| 2 * n)
    });

    opt.map_or(Ok(None), |r| r.map(Some))
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    println!("The first doubled is {:?}", double_first(numbers));
    println!("The first doubled is {:?}", double_first(empty));
    println!("The first doubled is {:?}", double_first(strings));
}

Defining an error type

Sometimes it simplifies the code to mask all of the different errors with a single type of error. We'll show this with a custom error.

Rust allows us to define our own error types. In general, a "good" error type:

  • Represents different errors with the same type
  • Presents nice error messages to the user
  • Is easy to compare with other types
    • Good: Err(EmptyVec)
    • Bad: Err("Please use a vector with at least one element".to_owned())
  • Can hold information about the error
    • Good: Err(BadChar(c, position))
    • Bad: Err("+ cannot be used here".to_owned())
  • Composes well with other errors
use std::fmt;

type Result<T> = std::result::Result<T, DoubleError>;

// Define our error types. These may be customized for our error handling cases.
// Now we will be able to write our own errors, defer to an underlying error
// implementation, or do something in between.
#[derive(Debug, Clone)]
struct DoubleError;

// Generation of an error is completely separate from how it is displayed.
// There's no need to be concerned about cluttering complex logic with the display style.
//
// Note that we don't store any extra info about the errors. This means we can't state
// which string failed to parse without modifying our types to carry that information.
impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    vec.first()
        // Change the error to our new type.
        .ok_or(DoubleError)
        .and_then(|s| {
            s.parse::<i32>()
                // Update to the new error type here also.
                .map_err(|_| DoubleError)
                .map(|i| 2 * i)
        })
}

fn print(result: Result<i32>) {
    match result {
        Ok(n) => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

Boxing errors

A way to write simple code while preserving the original errors is to Box them. The drawback is that the underlying error type is only known at runtime and not statically determined.

The stdlib helps in boxing our errors by having Box implement conversion from any type that implements the Error trait into the trait object Box<Error>, via From.

use std::error;
use std::fmt;

// Change the alias to `Box<error::Error>`.
type Result<T> = std::result::Result<T, Box<dyn error::Error>>;

#[derive(Debug, Clone)]
struct EmptyVec;

impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

impl error::Error for EmptyVec {}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    vec.first()
        .ok_or_else(|| EmptyVec.into()) // Converts to Box
        .and_then(|s| {
            s.parse::<i32>()
                .map_err(|e| e.into()) // Converts to Box
                .map(|i| 2 * i)
        })
}

fn print(result: Result<i32>) {
    match result {
        Ok(n) => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

See also:

Dynamic dispatch and Error trait

Other uses of ?

Notice in the previous example that our immediate reaction to calling parse is to map the error from a library error into a boxed error:

.and_then(|s| s.parse::<i32>())
    .map_err(|e| e.into())

Since this is a simple and common operation, it would be convenient if it could be elided. Alas, because and_then is not sufficiently flexible, it cannot. However, we can instead use ?.

? was previously explained as either unwrap or return Err(err). This is only mostly true. It actually means unwrap or return Err(From::from(err)). Since From::from is a conversion utility between different types, this means that if you ? where the error is convertible to the return type, it will convert automatically.

Here, we rewrite the previous example using ?. As a result, the map_err will go away when From::from is implemented for our error type:

use std::error;
use std::fmt;

// Change the alias to `Box<dyn error::Error>`.
type Result<T> = std::result::Result<T, Box<dyn error::Error>>;

#[derive(Debug)]
struct EmptyVec;

impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

impl error::Error for EmptyVec {}

// The same structure as before but rather than chain all `Results`
// and `Options` along, we `?` to get the inner value out immediately.
fn double_first(vec: Vec<&str>) -> Result<i32> {
    let first = vec.first().ok_or(EmptyVec)?;
    let parsed = first.parse::<i32>()?;
    Ok(2 * parsed)
}

fn print(result: Result<i32>) {
    match result {
        Ok(n)  => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

This is actually fairly clean now. Compared with the original panic, it is very similar to replacing the unwrap calls with ? except that the return types are Result. As a result, they must be destructured at the top level.

See also:

From::from and ?

Wrapping errors

An alternative to boxing errors is to wrap them in your own error type.

use std::error;
use std::error::Error;
use std::num::ParseIntError;
use std::fmt;

type Result<T> = std::result::Result<T, DoubleError>;

#[derive(Debug)]
enum DoubleError {
    EmptyVec,
    // We will defer to the parse error implementation for their error.
    // Supplying extra info requires adding more data to the type.
    Parse(ParseIntError),
}

impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            DoubleError::EmptyVec =>
                write!(f, "please use a vector with at least one element"),
            // The wrapped error contains additional information and is available
            // via the source() method.
            DoubleError::Parse(..) =>
                write!(f, "the provided string could not be parsed as int"),
        }
    }
}

impl error::Error for DoubleError {
    fn source(&self) -> Option<&(dyn error::Error + 'static)> {
        match *self {
            DoubleError::EmptyVec => None,
            // The cause is the underlying implementation error type. Is implicitly
            // cast to the trait object `&error::Error`. This works because the
            // underlying type already implements the `Error` trait.
            DoubleError::Parse(ref e) => Some(e),
        }
    }
}

// Implement the conversion from `ParseIntError` to `DoubleError`.
// This will be automatically called by `?` if a `ParseIntError`
// needs to be converted into a `DoubleError`.
impl From<ParseIntError> for DoubleError {
    fn from(err: ParseIntError) -> DoubleError {
        DoubleError::Parse(err)
    }
}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    let first = vec.first().ok_or(DoubleError::EmptyVec)?;
    // Here we implicitly use the `ParseIntError` implementation of `From` (which
    // we defined above) in order to create a `DoubleError`.
    let parsed = first.parse::<i32>()?;

    Ok(2 * parsed)
}

fn print(result: Result<i32>) {
    match result {
        Ok(n)  => println!("The first doubled is {}", n),
        Err(e) => {
            println!("Error: {}", e);
            if let Some(source) = e.source() {
                println!("  Caused by: {}", source);
            }
        },
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

This adds a bit more boilerplate for handling errors and might not be needed in all applications. There are some libraries that can take care of the boilerplate for you.

See also:

From::from and Enums

Iterating over Results

An Iter::map operation might fail, for example:

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let numbers: Vec<_> = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .collect();
    println!("Results: {:?}", numbers);
}

Let's step through strategies for handling this.

Ignore the failed items with filter_map()

filter_map calls a function and filters out the results that are None.

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let numbers: Vec<_> = strings
        .into_iter()
        .filter_map(|s| s.parse::<i32>().ok())
        .collect();
    println!("Results: {:?}", numbers);
}

Collect the failed items with map_err() and filter_map()

map_err calls a function with the error, so by adding that to the previous filter_map solution we can save them off to the side while iterating.

fn main() {
    let strings = vec!["42", "tofu", "93", "999", "18"];
    let mut errors = vec![];
    let numbers: Vec<_> = strings
        .into_iter()
        .map(|s| s.parse::<u8>())
        .filter_map(|r| r.map_err(|e| errors.push(e)).ok())
        .collect();
    println!("Numbers: {:?}", numbers);
    println!("Errors: {:?}", errors);
}

Fail the entire operation with collect()

Result implements FromIterator so that a vector of results (Vec<Result<T, E>>) can be turned into a result with a vector (Result<Vec<T>, E>). Once an Result::Err is found, the iteration will terminate.

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let numbers: Result<Vec<_>, _> = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .collect();
    println!("Results: {:?}", numbers);
}

This same technique can be used with Option.

Collect all valid values and failures with partition()

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let (numbers, errors): (Vec<_>, Vec<_>) = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .partition(Result::is_ok);
    println!("Numbers: {:?}", numbers);
    println!("Errors: {:?}", errors);
}

When you look at the results, you'll note that everything is still wrapped in Result. A little more boilerplate is needed for this.

fn main() {
    let strings = vec!["tofu", "93", "18"];
    let (numbers, errors): (Vec<_>, Vec<_>) = strings
        .into_iter()
        .map(|s| s.parse::<i32>())
        .partition(Result::is_ok);
    let numbers: Vec<_> = numbers.into_iter().map(Result::unwrap).collect();
    let errors: Vec<_> = errors.into_iter().map(Result::unwrap_err).collect();
    println!("Numbers: {:?}", numbers);
    println!("Errors: {:?}", errors);
}

Std library types

The std library provides many custom types which expands drastically on the primitives. Some of these include:

  • growable Strings like: "hello world"
  • growable vectors: [1, 2, 3]
  • optional types: Option<i32>
  • error handling types: Result<i32, i32>
  • heap allocated pointers: Box<i32>

See also:

primitives and the std library

Box, stack and heap

All values in Rust are stack allocated by default. Values can be boxed (allocated on the heap) by creating a Box<T>. A box is a smart pointer to a heap allocated value of type T. When a box goes out of scope, its destructor is called, the inner object is destroyed, and the memory on the heap is freed.

Boxed values can be dereferenced using the * operator; this removes one layer of indirection.

use std::mem;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct Point {
    x: f64,
    y: f64,
}

// A Rectangle can be specified by where its top left and bottom right 
// corners are in space
#[allow(dead_code)]
struct Rectangle {
    top_left: Point,
    bottom_right: Point,
}

fn origin() -> Point {
    Point { x: 0.0, y: 0.0 }
}

fn boxed_origin() -> Box<Point> {
    // Allocate this point on the heap, and return a pointer to it
    Box::new(Point { x: 0.0, y: 0.0 })
}

fn main() {
    // (all the type annotations are superfluous)
    // Stack allocated variables
    let point: Point = origin();
    let rectangle: Rectangle = Rectangle {
        top_left: origin(),
        bottom_right: Point { x: 3.0, y: -4.0 }
    };

    // Heap allocated rectangle
    let boxed_rectangle: Box<Rectangle> = Box::new(Rectangle {
        top_left: origin(),
        bottom_right: Point { x: 3.0, y: -4.0 },
    });

    // The output of functions can be boxed
    let boxed_point: Box<Point> = Box::new(origin());

    // Double indirection
    let box_in_a_box: Box<Box<Point>> = Box::new(boxed_origin());

    println!("Point occupies {} bytes on the stack",
             mem::size_of_val(&point));
    println!("Rectangle occupies {} bytes on the stack",
             mem::size_of_val(&rectangle));

    // box size == pointer size
    println!("Boxed point occupies {} bytes on the stack",
             mem::size_of_val(&boxed_point));
    println!("Boxed rectangle occupies {} bytes on the stack",
             mem::size_of_val(&boxed_rectangle));
    println!("Boxed box occupies {} bytes on the stack",
             mem::size_of_val(&box_in_a_box));

    // Copy the data contained in `boxed_point` into `unboxed_point`
    let unboxed_point: Point = *boxed_point;
    println!("Unboxed point occupies {} bytes on the stack",
             mem::size_of_val(&unboxed_point));
}

Vectors

Vectors are re-sizable arrays. Like slices, their size is not known at compile time, but they can grow or shrink at any time. A vector is represented using 3 parameters:

  • pointer to the data
  • length
  • capacity

The capacity indicates how much memory is reserved for the vector. The vector can grow as long as the length is smaller than the capacity. When this threshold needs to be surpassed, the vector is reallocated with a larger capacity:

fn main() {
    // Iterators can be collected into vectors
    let collected_iterator: Vec<i32> = (0..10).collect();
    println!("Collected (0..10) into: {:?}", collected_iterator);

    // The `vec!` macro can be used to initialize a vector
    let mut xs = vec![1i32, 2, 3];
    println!("Initial vector: {:?}", xs);

    // Insert new element at the end of the vector
    println!("Push 4 into the vector");
    xs.push(4);
    println!("Vector: {:?}", xs);

    // Error! Immutable vectors can't grow
    collected_iterator.push(0);
    // FIXME ^ Comment out this line

    // The `len` method yields the number of elements currently stored in a vector
    println!("Vector length: {}", xs.len());

    // Indexing is done using the square brackets (indexing starts at 0)
    println!("Second element: {}", xs[1]);

    // `pop` removes the last element from the vector and returns it
    println!("Pop last element: {:?}", xs.pop());

    // Out of bounds indexing yields a panic
    println!("Fourth element: {}", xs[3]);
    // FIXME ^ Comment out this line

    // `Vector`s can be easily iterated over
    println!("Contents of xs:");
    for x in xs.iter() {
        println!("> {}", x);
    }

    // A `Vector` can also be iterated over while the iteration
    // count is enumerated in a separate variable (`i`)
    for (i, x) in xs.iter().enumerate() {
        println!("In position {} we have value {}", i, x);
    }

    // Thanks to `iter_mut`, mutable `Vector`s can also be iterated
    // over in a way that allows modifying each value
    for x in xs.iter_mut() {
        *x *= 3;
    }
    println!("Updated vector: {:?}", xs);
}

More Vec methods can be found under the std::vec module

Strings

There are two types of strings in Rust: String and &str.

A String is stored as a vector of bytes (Vec<u8>), but guaranteed to always be a valid UTF-8 sequence. String is heap allocated, growable and not null terminated.

&str is a slice (&[u8]) that always points to a valid UTF-8 sequence, and can be used to view into a String, just like &[T] is a view into Vec<T>:

fn main() {
    // (all the type annotations are superfluous)
    // A reference to a string allocated in read only memory
    let pangram: &'static str = "the quick brown fox jumps over the lazy dog";
    println!("Pangram: {}", pangram);

    // Iterate over words in reverse, no new string is allocated
    println!("Words in reverse");
    for word in pangram.split_whitespace().rev() {
        println!("> {}", word);
    }

    // Copy chars into a vector, sort and remove duplicates
    let mut chars: Vec<char> = pangram.chars().collect();
    chars.sort();
    chars.dedup();

    // Create an empty and growable `String`
    let mut string = String::new();
    for c in chars {
        // Insert a char at the end of string
        string.push(c);
        // Insert a string at the end of string
        string.push_str(", ");
    }

    // The trimmed string is a slice to the original string, hence no new
    // allocation is performed
    let chars_to_trim: &[char] = &[' ', ','];
    let trimmed_str: &str = string.trim_matches(chars_to_trim);
    println!("Used characters: {}", trimmed_str);

    // Heap allocate a string
    let alice = String::from("I like dogs");
    // Allocate new memory and store the modified string there
    let bob: String = alice.replace("dog", "cat");

    println!("Alice says: {}", alice);
    println!("Bob says: {}", bob);
}

More str/String methods can be found under the std::str and std::string modules

Literals and escapes

There are multiple ways to write string literals with special characters in them. All result in a similar &str so it's best to use the form that is the most convenient to write. Similarly there are multiple ways to write byte string literals, which all result in &[u8; N].

Generally special characters are escaped with a backslash character: \. This way you can add any character to your string, even unprintable ones and ones that you don't know how to type. If you want a literal backslash, escape it with another one: \\

String or character literal delimiters occurring within a literal must be escaped: "\"", '\''.

fn main() {
    // You can use escapes to write bytes by their hexadecimal values...
    let byte_escape = "I'm writing \x52\x75\x73\x74!";
    println!("What are you doing\x3F (\\x3F means ?) {}", byte_escape);

    // ...or Unicode code points.
    let unicode_codepoint = "\u{211D}";
    let character_name = "\"DOUBLE-STRUCK CAPITAL R\"";

    println!("Unicode character {} (U+211D) is called {}",
                unicode_codepoint, character_name );


    let long_string = "String literals
                        can span multiple lines.
                        The linebreak and indentation here ->\
                        <- can be escaped too!";
    println!("{}", long_string);
}

Sometimes there are just too many characters that need to be escaped or it's just much more convenient to write a string out as-is. This is where raw string literals come into play.

fn main() {
    let raw_str = r"Escapes don't work here: \x3F \u{211D}";
    println!("{}", raw_str);

    // If you need quotes in a raw string, add a pair of #s
    let quotes = r#"And then I said: "There is no escape!""#;
    println!("{}", quotes);

    // If you need "# in your string, just use more #s in the delimiter.
    // You can use up to 65535 #s.
    let longer_delimiter = r###"A string with "# in it. And even "##!"###;
    println!("{}", longer_delimiter);
}

Want a string that's not UTF-8? (Remember, str and String must be valid UTF-8). Or maybe you want an array of bytes that's mostly text? Byte strings to the rescue!

use std::str;

fn main() {
    // Note that this is not actually a `&str`
    let bytestring: &[u8; 21] = b"this is a byte string";

    // Byte arrays don't have the `Display` trait, so printing them is a bit limited
    println!("A byte string: {:?}", bytestring);

    // Byte strings can have byte escapes...
    let escaped = b"\x52\x75\x73\x74 as bytes";
    // ...but no unicode escapes
    // let escaped = b"\u{211D} is not allowed";
    println!("Some escaped bytes: {:?}", escaped);


    // Raw byte strings work just like raw strings
    let raw_bytestring = br"\u{211D} is not escaped here";
    println!("{:?}", raw_bytestring);

    // Converting a byte array to `str` can fail
    if let Ok(my_str) = str::from_utf8(raw_bytestring) {
        println!("And the same as text: '{}'", my_str);
    }

    let _quotes = br#"You can also use "fancier" formatting, \
                    like with normal raw strings"#;

    // Byte strings don't have to be UTF-8
    let shift_jis = b"\x82\xe6\x82\xa8\x82\xb1\x82\xbb"; // "ようこそ" in SHIFT-JIS

    // But then they can't always be converted to `str`
    match str::from_utf8(shift_jis) {
        Ok(my_str) => println!("Conversion successful: '{}'", my_str),
        Err(e) => println!("Conversion failed: {:?}", e),
    };
}

For conversions between character encodings check out the encoding crate.

A more detailed listing of the ways to write string literals and escape characters is given in the 'Tokens' chapter of the Rust Reference.

Option

Sometimes it's desirable to catch the failure of some parts of a program instead of calling panic!; this can be accomplished using the Option enum.

The Option<T> enum has two variants:

  • None, to indicate failure or lack of value, and
  • Some(value), a tuple struct that wraps a value with type T.
// An integer division that doesn't `panic!`
fn checked_division(dividend: i32, divisor: i32) -> Option<i32> {
    if divisor == 0 {
        // Failure is represented as the `None` variant
        None
    } else {
        // Result is wrapped in a `Some` variant
        Some(dividend / divisor)
    }
}

// This function handles a division that may not succeed
fn try_division(dividend: i32, divisor: i32) {
    // `Option` values can be pattern matched, just like other enums
    match checked_division(dividend, divisor) {
        None => println!("{} / {} failed!", dividend, divisor),
        Some(quotient) => {
            println!("{} / {} = {}", dividend, divisor, quotient)
        },
    }
}

fn main() {
    try_division(4, 2);
    try_division(1, 0);

    // Binding `None` to a variable needs to be type annotated
    let none: Option<i32> = None;
    let _equivalent_none = None::<i32>;

    let optional_float = Some(0f32);

    // Unwrapping a `Some` variant will extract the value wrapped.
    println!("{:?} unwraps to {:?}", optional_float, optional_float.unwrap());

    // Unwrapping a `None` variant will `panic!`
    println!("{:?} unwraps to {:?}", none, none.unwrap());
}

Result

We've seen that the Option enum can be used as a return value from functions that may fail, where None can be returned to indicate failure. However, sometimes it is important to express why an operation failed. To do this we have the Result enum.

The Result<T, E> enum has two variants:

  • Ok(value) which indicates that the operation succeeded, and wraps the value returned by the operation. (value has type T)
  • Err(why), which indicates that the operation failed, and wraps why, which (hopefully) explains the cause of the failure. (why has type E)
mod checked {
    // Mathematical "errors" we want to catch
    #[derive(Debug)]
    pub enum MathError {
        DivisionByZero,
        NonPositiveLogarithm,
        NegativeSquareRoot,
    }

    pub type MathResult = Result<f64, MathError>;

    pub fn div(x: f64, y: f64) -> MathResult {
        if y == 0.0 {
            // This operation would `fail`, instead let's return the reason of
            // the failure wrapped in `Err`
            Err(MathError::DivisionByZero)
        } else {
            // This operation is valid, return the result wrapped in `Ok`
            Ok(x / y)
        }
    }

    pub fn sqrt(x: f64) -> MathResult {
        if x < 0.0 {
            Err(MathError::NegativeSquareRoot)
        } else {
            Ok(x.sqrt())
        }
    }

    pub fn ln(x: f64) -> MathResult {
        if x <= 0.0 {
            Err(MathError::NonPositiveLogarithm)
        } else {
            Ok(x.ln())
        }
    }
}

// `op(x, y)` === `sqrt(ln(x / y))`
fn op(x: f64, y: f64) -> f64 {
    // This is a three level match pyramid!
    match checked::div(x, y) {
        Err(why) => panic!("{:?}", why),
        Ok(ratio) => match checked::ln(ratio) {
            Err(why) => panic!("{:?}", why),
            Ok(ln) => match checked::sqrt(ln) {
                Err(why) => panic!("{:?}", why),
                Ok(sqrt) => sqrt,
            },
        },
    }
}

fn main() {
    // Will this fail?
    println!("{}", op(1.0, 10.0));
}

?

Chaining results using match can get pretty untidy; luckily, the ? operator can be used to make things pretty again. ? is used at the end of an expression returning a Result, and is equivalent to a match expression, where the Err(err) branch expands to an early return Err(From::from(err)), and the Ok(ok) branch expands to an ok expression:

mod checked {
    #[derive(Debug)]
    enum MathError {
        DivisionByZero,
        NonPositiveLogarithm,
        NegativeSquareRoot,
    }

    type MathResult = Result<f64, MathError>;

    fn div(x: f64, y: f64) -> MathResult {
        if y == 0.0 {
            Err(MathError::DivisionByZero)
        } else {
            Ok(x / y)
        }
    }

    fn sqrt(x: f64) -> MathResult {
        if x < 0.0 {
            Err(MathError::NegativeSquareRoot)
        } else {
            Ok(x.sqrt())
        }
    }

    fn ln(x: f64) -> MathResult {
        if x <= 0.0 {
            Err(MathError::NonPositiveLogarithm)
        } else {
            Ok(x.ln())
        }
    }

    // Intermediate function
    fn op_(x: f64, y: f64) -> MathResult {
        // if `div` "fails", then `DivisionByZero` will be `return`ed
        let ratio = div(x, y)?;

        // if `ln` "fails", then `NonPositiveLogarithm` will be `return`ed
        let ln = ln(ratio)?;

        sqrt(ln)
    }

    pub fn op(x: f64, y: f64) {
        match op_(x, y) {
            Err(why) => panic!("{}", match why {
                MathError::NonPositiveLogarithm
                    => "logarithm of non-positive number",
                MathError::DivisionByZero
                    => "division by zero",
                MathError::NegativeSquareRoot
                    => "square root of negative number",
            }),
            Ok(value) => println!("{}", value),
        }
    }
}

fn main() {
    checked::op(1.0, 10.0);
}

Be sure to check the documentation, as there are many methods to map/compose Result.

panic!

The panic! macro can be used to generate a panic and start unwinding its stack. While unwinding, the runtime will take care of freeing all the resources owned by the thread by calling the destructor of all its objects.

Since we are dealing with programs with only one thread, panic! will cause the program to report the panic message and exit.

// Re-implementation of integer division (/)
fn division(dividend: i32, divisor: i32) -> i32 {
    if divisor == 0 {
        // Division by zero triggers a panic
        panic!("division by zero");
    } else {
        dividend / divisor
    }
}

// The `main` task
fn main() {
    // Heap allocated integer
    let _x = Box::new(0i32);

    // This operation will trigger a task failure
    division(3, 0);

    println!("This point won't be reached!");

    // `_x` should get destroyed at this point
}

Let's check that panic! doesn't leak memory.

$ rustc panic.rs && valgrind ./panic
==4401== Memcheck, a memory error detector
==4401== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4401== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==4401== Command: ./panic
==4401== 
thread '<main>' panicked at 'division by zero', panic.rs:5
==4401== 
==4401== HEAP SUMMARY:
==4401==     in use at exit: 0 bytes in 0 blocks
==4401==   total heap usage: 18 allocs, 18 frees, 1,648 bytes allocated
==4401== 
==4401== All heap blocks were freed -- no leaks are possible
==4401== 
==4401== For counts of detected and suppressed errors, rerun with: -v
==4401== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

HashMap

Where vectors store values by an integer index, HashMaps store values by key. HashMap keys can be booleans, integers, strings, or any other type that implements the Eq and Hash traits. More on this in the next section.

Like vectors, HashMaps are growable, but HashMaps can also shrink themselves when they have excess space. You can create a HashMap with a certain starting capacity using HashMap::with_capacity(uint), or use HashMap::new() to get a HashMap with a default initial capacity (recommended):

use std::collections::HashMap;

fn call(number: &str) -> &str {
    match number {
        "798-1364" => "We're sorry, the call cannot be completed as dialed. 
            Please hang up and try again.",
        "645-7689" => "Hello, this is Mr. Awesome's Pizza. My name is Fred.
            What can I get for you today?",
        _ => "Hi! Who is this again?"
    }
}

fn main() { 
    let mut contacts = HashMap::new();

    contacts.insert("Daniel", "798-1364");
    contacts.insert("Ashley", "645-7689");
    contacts.insert("Katie", "435-8291");
    contacts.insert("Robert", "956-1745");

    // Takes a reference and returns Option<&V>
    match contacts.get(&"Daniel") {
        Some(&number) => println!("Calling Daniel: {}", call(number)),
        _ => println!("Don't have Daniel's number."),
    }

    // `HashMap::insert()` returns `None`
    // if the inserted value is new, `Some(value)` otherwise
    contacts.insert("Daniel", "164-6743");

    match contacts.get(&"Ashley") {
        Some(&number) => println!("Calling Ashley: {}", call(number)),
        _ => println!("Don't have Ashley's number."),
    }

    contacts.remove(&"Ashley"); 

    // `HashMap::iter()` returns an iterator that yields 
    // (&'a key, &'a value) pairs in arbitrary order.
    for (contact, &number) in contacts.iter() {
        println!("Calling {}: {}", contact, call(number)); 
    }
}

For more information on how hashing and hash maps (sometimes called hash tables) work, have a look at Hash Table Wikipedia

Alternate/custom key types

Any type that implements the Eq and Hash traits can be a key in HashMap. This includes:

  • bool (though not very useful since there is only two possible keys)
  • int, uint, and all variations thereof
  • String and &str (protip: you can have a HashMap keyed by String and call .get() with an &str)

Note that f32 and f64 do not implement Hash, likely because floating-point precision errors would make using them as hashmap keys horribly error-prone.

All collection classes implement Eq and Hash if their contained type also respectively implements Eq and Hash. For example, Vec<T> will implement Hash if T implements Hash.

You can easily implement Eq and Hash for a custom type with just one line: #[derive(PartialEq, Eq, Hash)]

The compiler will do the rest. If you want more control over the details, you can implement Eq and/or Hash yourself. This guide will not cover the specifics of implementing Hash.

To play around with using a struct in HashMap, let's try making a very simple user logon system:

use std::collections::HashMap;

// Eq requires that you derive PartialEq on the type.
#[derive(PartialEq, Eq, Hash)]
struct Account<'a>{
    username: &'a str,
    password: &'a str,
}

struct AccountInfo<'a>{
    name: &'a str,
    email: &'a str,
}

type Accounts<'a> = HashMap<Account<'a>, AccountInfo<'a>>;

fn try_logon<'a>(accounts: &Accounts<'a>,
        username: &'a str, password: &'a str){
    println!("Username: {}", username);
    println!("Password: {}", password);
    println!("Attempting logon...");

    let logon = Account {
        username,
        password,
    };

    match accounts.get(&logon) {
        Some(account_info) => {
            println!("Successful logon!");
            println!("Name: {}", account_info.name);
            println!("Email: {}", account_info.email);
        },
        _ => println!("Login failed!"),
    }
}

fn main(){
    let mut accounts: Accounts = HashMap::new();

    let account = Account {
        username: "j.everyman",
        password: "password123",
    };

    let account_info = AccountInfo {
        name: "John Everyman",
        email: "j.everyman@email.com",
    };

    accounts.insert(account, account_info);

    try_logon(&accounts, "j.everyman", "psasword123");

    try_logon(&accounts, "j.everyman", "password123");
}

HashSet

Consider a HashSet as a HashMap where we just care about the keys (HashSet<T> is, in actuality, just a wrapper around HashMap<T, ()>).

"What's the point of that?" you ask. "I could just store the keys in a Vec."

A HashSet's unique feature is that it is guaranteed to not have duplicate elements. That's the contract that any set collection fulfills. HashSet is just one implementation. (see also: BTreeSet)

If you insert a value that is already present in the HashSet, (i.e. the new value is equal to the existing and they both have the same hash), then the new value will replace the old.

This is great for when you never want more than one of something, or when you want to know if you've already got something.

But sets can do more than that.

Sets have 4 primary operations (all of the following calls return an iterator):

  • union: get all the unique elements in both sets.

  • difference: get all the elements that are in the first set but not the second.

  • intersection: get all the elements that are only in both sets.

  • symmetric_difference: get all the elements that are in one set or the other, but not both.

Try all of these in the following example:

use std::collections::HashSet;

fn main() {
    let mut a: HashSet<i32> = vec![1i32, 2, 3].into_iter().collect();
    let mut b: HashSet<i32> = vec![2i32, 3, 4].into_iter().collect();

    assert!(a.insert(4));
    assert!(a.contains(&4));

    // `HashSet::insert()` returns false if
    // there was a value already present.
    assert!(b.insert(4), "Value 4 is already in set B!");
    // FIXME ^ Comment out this line

    b.insert(5);

    // If a collection's element type implements `Debug`,
    // then the collection implements `Debug`.
    // It usually prints its elements in the format `[elem1, elem2, ...]`
    println!("A: {:?}", a);
    println!("B: {:?}", b);

    // Print [1, 2, 3, 4, 5] in arbitrary order
    println!("Union: {:?}", a.union(&b).collect::<Vec<&i32>>());

    // This should print [1]
    println!("Difference: {:?}", a.difference(&b).collect::<Vec<&i32>>());

    // Print [2, 3, 4] in arbitrary order.
    println!("Intersection: {:?}", a.intersection(&b).collect::<Vec<&i32>>());

    // Print [1, 5]
    println!("Symmetric Difference: {:?}",
             a.symmetric_difference(&b).collect::<Vec<&i32>>());
}

(Examples are adapted from the documentation.)

Rc

When multiple ownership is needed, Rc(Reference Counting) can be used. Rc keeps track of the number of the references which means the number of owners of the value wrapped inside an Rc.

Reference count of an Rc increases by 1 whenever an Rc is cloned, and decreases by 1 whenever one cloned Rc is dropped out of the scope. When an Rc's reference count becomes zero (which means there are no remaining owners), both the Rc and the value are all dropped.

Cloning an Rc never performs a deep copy. Cloning creates just another pointer to the wrapped value, and increments the count:

use std::rc::Rc;

fn main() {
    let rc_examples = "Rc examples".to_string();
    {
        println!("--- rc_a is created ---");
        
        let rc_a: Rc<String> = Rc::new(rc_examples);
        println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a));
        
        {
            println!("--- rc_a is cloned to rc_b ---");
            
            let rc_b: Rc<String> = Rc::clone(&rc_a);
            println!("Reference Count of rc_b: {}", Rc::strong_count(&rc_b));
            println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a));
            
            // Two `Rc`s are equal if their inner values are equal
            println!("rc_a and rc_b are equal: {}", rc_a.eq(&rc_b));
            
            // We can use methods of a value directly
            println!("Length of the value inside rc_a: {}", rc_a.len());
            println!("Value of rc_b: {}", rc_b);
            
            println!("--- rc_b is dropped out of scope ---");
        }
        
        println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a));
        
        println!("--- rc_a is dropped out of scope ---");
    }
    
    // Error! `rc_examples` already moved into `rc_a`
    // And when `rc_a` is dropped, `rc_examples` is dropped together
    // println!("rc_examples: {}", rc_examples);
    // TODO ^ Try uncommenting this line
}

See also:

std::rc and std::sync::arc.

Arc

When shared ownership between threads is needed, Arc(Atomically Reference Counted) can be used. This struct, via the Clone implementation can create a reference pointer for the location of a value in the memory heap while increasing the reference counter. As it shares ownership between threads, when the last reference pointer to a value is out of scope, the variable is dropped:

use std::time::Duration;
use std::sync::Arc;
use std::thread;

fn main() {
    // This variable declaration is where its value is specified.
    let apple = Arc::new("the same apple");

    for _ in 0..10 {
        // Here there is no value specification as it is a pointer to a
        // reference in the memory heap.
        let apple = Arc::clone(&apple);

        thread::spawn(move || {
            // As Arc was used, threads can be spawned using the value allocated
            // in the Arc variable pointer's location.
            println!("{:?}", apple);
        });
    }

    // Make sure all Arc instances are printed from spawned threads.
    thread::sleep(Duration::from_secs(1));
}

Std misc

Many other types are provided by the std library to support things such as:

  • Threads
  • Channels
  • File I/O

These expand beyond what the primitives provide.

See also:

primitives and the std library

Threads

Rust provides a mechanism for spawning native OS threads via the spawn function, the argument of this function is a moving closure.

use std::thread;

const NTHREADS: u32 = 10;

// This is the `main` thread
fn main() {
    // Make a vector to hold the children which are spawned.
    let mut children = vec![];

    for i in 0..NTHREADS {
        // Spin up another thread
        children.push(thread::spawn(move || {
            println!("this is thread number {}", i);
        }));
    }

    for child in children {
        // Wait for the thread to finish. Returns a result.
        let _ = child.join();
    }
}

These threads will be scheduled by the OS.

Testcase: map-reduce

Rust makes it very easy to parallelise data processing, without many of the headaches traditionally associated with such an attempt.

The standard library provides great threading primitives out of the box. These, combined with Rust's concept of Ownership and aliasing rules, automatically prevent data races.

The aliasing rules (one writable reference XOR many readable references) automatically prevent you from manipulating state that is visible to other threads. (Where synchronisation is needed, there are synchronisation primitives like Mutexes or Channels.)

In this example, we will calculate the sum of all digits in a block of numbers. We will do this by parcelling out chunks of the block into different threads. Each thread will sum its tiny block of digits, and subsequently we will sum the intermediate sums produced by each thread.

Note that, although we're passing references across thread boundaries, Rust understands that we're only passing read-only references, and that thus no unsafety or data races can occur. Also because the references we're passing have 'static lifetimes, Rust understands that our data won't be destroyed while these threads are still running. (When you need to share non-static data between threads, you can use a smart pointer like Arc to keep the data alive and avoid non-static lifetimes.)

use std::thread;

// This is the `main` thread
fn main() {

    // This is our data to process.
    // We will calculate the sum of all digits via a threaded  map-reduce algorithm.
    // Each whitespace separated chunk will be handled in a different thread.
    //
    // TODO: see what happens to the output if you insert spaces!
    let data = "86967897737416471853297327050364959
11861322575564723963297542624962850
70856234701860851907960690014725639
38397966707106094172783238747669219
52380795257888236525459303330302837
58495327135744041048897885734297812
69920216438980873548808413720956532
16278424637452589860345374828574668";

    // Make a vector to hold the child-threads which we will spawn.
    let mut children = vec![];

    /*************************************************************************
     * "Map" phase
     *
     * Divide our data into segments, and apply initial processing
     ************************************************************************/

    // split our data into segments for individual calculation
    // each chunk will be a reference (&str) into the actual data
    let chunked_data = data.split_whitespace();

    // Iterate over the data segments.
    // .enumerate() adds the current loop index to whatever is iterated
    // the resulting tuple "(index, element)" is then immediately
    // "destructured" into two variables, "i" and "data_segment" with a
    // "destructuring assignment"
    for (i, data_segment) in chunked_data.enumerate() {
        println!("data segment {} is \"{}\"", i, data_segment);

        // Process each data segment in a separate thread
        //
        // spawn() returns a handle to the new thread,
        // which we MUST keep to access the returned value
        //
        // 'move || -> u32' is syntax for a closure that:
        // * takes no arguments ('||')
        // * takes ownership of its captured variables ('move') and
        // * returns an unsigned 32-bit integer ('-> u32')
        //
        // Rust is smart enough to infer the '-> u32' from
        // the closure itself so we could have left that out.
        //
        // TODO: try removing the 'move' and see what happens
        children.push(thread::spawn(move || -> u32 {
            // Calculate the intermediate sum of this segment:
            let result = data_segment
                        // iterate over the characters of our segment..
                        .chars()
                        // .. convert text-characters to their number value..
                        .map(|c| c.to_digit(10).expect("should be a digit"))
                        // .. and sum the resulting iterator of numbers
                        .sum();

            // println! locks stdout, so no text-interleaving occurs
            println!("processed segment {}, result={}", i, result);

            // "return" not needed, because Rust is an "expression language", the
            // last evaluated expression in each block is automatically its value.
            result

        }));
    }


    /*************************************************************************
     * "Reduce" phase
     *
     * Collect our intermediate results, and combine them into a final result
     ************************************************************************/

    // combine each thread's intermediate results into a single final sum.
    //
    // we use the "turbofish" ::<> to provide sum() with a type hint.
    //
    // TODO: try without the turbofish, by instead explicitly
    // specifying the type of final_result
    let final_result = children.into_iter().map(|c| c.join().unwrap()).sum::<u32>();

    println!("Final sum result: {}", final_result);
}

Assignments

It is not wise to let our number of threads depend on user inputted data. What if the user decides to insert a lot of spaces? Do we really want to spawn 2,000 threads? Modify the program so that the data is always chunked into a limited number of chunks, defined by a static constant at the beginning of the program.

See also:

Channels

Rust provides asynchronous channels for communication between threads. Channels allow a unidirectional flow of information between two end-points: the Sender and the Receiver.

use std::sync::mpsc::{Sender, Receiver};
use std::sync::mpsc;
use std::thread;

static NTHREADS: i32 = 3;

fn main() {
    // Channels have two endpoints: the `Sender<T>` and the `Receiver<T>`,
    // where `T` is the type of the message to be transferred
    // (type annotation is superfluous)
    let (tx, rx): (Sender<i32>, Receiver<i32>) = mpsc::channel();
    let mut children = Vec::new();

    for id in 0..NTHREADS {
        // The sender endpoint can be copied
        let thread_tx = tx.clone();

        // Each thread will send its id via the channel
        let child = thread::spawn(move || {
            // The thread takes ownership over `thread_tx`
            // Each thread queues a message in the channel
            thread_tx.send(id).unwrap();

            // Sending is a non-blocking operation, the thread will continue
            // immediately after sending its message
            println!("thread {} finished", id);
        });

        children.push(child);
    }

    // Here, all the messages are collected
    let mut ids = Vec::with_capacity(NTHREADS as usize);
    for _ in 0..NTHREADS {
        // The `recv` method picks a message from the channel
        // `recv` will block the current thread if there are no messages available
        ids.push(rx.recv());
    }
    
    // Wait for the threads to complete any remaining work
    for child in children {
        child.join().expect("oops! the child thread panicked");
    }

    // Show the order in which the messages were sent
    println!("{:?}", ids);
}

Path

The Path struct represents file paths in the underlying filesystem. There are two flavors of Path: posix::Path, for UNIX-like systems, and windows::Path, for Windows. The prelude exports the appropriate platform-specific Path variant.

A Path can be created from an OsStr, and provides several methods to get information from the file/directory the path points to.

A Path is immutable. The owned version of Path is PathBuf. The relation between Path and PathBuf is similar to that of str and String: a PathBuf can be mutated in-place, and can be dereferenced to a Path.

Note that a Path is not internally represented as an UTF-8 string, but instead is stored as an OsString. Therefore, converting a Path to a &str is not free and may fail (an Option is returned). However, a Path can be freely converted to an OsString or &OsStr using into_os_string and as_os_str, respectively.

use std::path::Path;

fn main() {
    // Create a `Path` from an `&'static str`
    let path = Path::new(".");

    // The `display` method returns a `Display`able structure
    let _display = path.display();

    // `join` merges a path with a byte container using the OS specific
    // separator, and returns a `PathBuf`
    let mut new_path = path.join("a").join("b");

    // `push` extends the `PathBuf` with a `&Path`
    new_path.push("c");
    new_path.push("myfile.tar.gz");

    // `set_file_name` updates the file name of the `PathBuf`
    new_path.set_file_name("package.tgz");

    // Convert the `PathBuf` into a string slice
    match new_path.to_str() {
        None => panic!("new path is not a valid UTF-8 sequence"),
        Some(s) => println!("new path is {}", s),
    }
}

Be sure to check at other Path methods (posix::Path or windows::Path) and the Metadata struct.

See also:

OsStr and Metadata.

File I/O

The File struct represents a file that has been opened (it wraps a file descriptor), and gives read and/or write access to the underlying file.

Since many things can go wrong when doing file I/O, all the File methods return the io::Result<T> type, which is an alias for Result<T, io::Error>.

This makes the failure of all I/O operations explicit. Thanks to this, the programmer can see all the failure paths, and is encouraged to handle them in a proactive manner.

open

The open function can be used to open a file in read-only mode.

A File owns a resource, the file descriptor and takes care of closing the file when it is droped.

use std::fs::File;
use std::io::prelude::*;
use std::path::Path;

fn main() {
    // Create a path to the desired file
    let path = Path::new("hello.txt");
    let display = path.display();

    // Open the path in read-only mode, returns `io::Result<File>`
    let mut file = match File::open(&path) {
        Err(why) => panic!("couldn't open {}: {}", display, why),
        Ok(file) => file,
    };

    // Read the file contents into a string, returns `io::Result<usize>`
    let mut s = String::new();
    match file.read_to_string(&mut s) {
        Err(why) => panic!("couldn't read {}: {}", display, why),
        Ok(_) => print!("{} contains:\n{}", display, s),
    }

    // `file` goes out of scope, and the "hello.txt" file gets closed
}

Here's the expected successful output:

$ echo "Hello World!" > hello.txt
$ rustc open.rs && ./open
hello.txt contains:
Hello World!

(You are encouraged to test the previous example under different failure conditions: hello.txt doesn't exist, or hello.txt is not readable, etc.)

create

The create function opens a file in write-only mode. If the file already existed, the old content is destroyed. Otherwise, a new file is created.

static LOREM_IPSUM: &str =
    "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
";

use std::fs::File;
use std::io::prelude::*;
use std::path::Path;

fn main() {
    let path = Path::new("lorem_ipsum.txt");
    let display = path.display();

    // Open a file in write-only mode, returns `io::Result<File>`
    let mut file = match File::create(&path) {
        Err(why) => panic!("couldn't create {}: {}", display, why),
        Ok(file) => file,
    };

    // Write the `LOREM_IPSUM` string to `file`, returns `io::Result<()>`
    match file.write_all(LOREM_IPSUM.as_bytes()) {
        Err(why) => panic!("couldn't write to {}: {}", display, why),
        Ok(_) => println!("successfully wrote to {}", display),
    }
}

Here's the expected successful output:

$ rustc create.rs && ./create
successfully wrote to lorem_ipsum.txt
$ cat lorem_ipsum.txt
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

(As in the previous example, you are encouraged to test this example under failure conditions.)

The OpenOptions struct can be used to configure how a file is opened.

read_lines

A naive approach

This might be a reasonable first attempt for a beginner's first implementation for reading lines from a file.

#![allow(unused)]
fn main() {
use std::fs::read_to_string;

fn read_lines(filename: &str) -> Vec<String> {
    let mut result = Vec::new();

    for line in read_to_string(filename).unwrap().lines() {
        result.push(line.to_string())
    }

    result
}
}

Since the method lines() returns an iterator over the lines in the file, we can also perform a map inline and collect the results, yielding a more concise and fluent expression.

#![allow(unused)]
fn main() {
use std::fs::read_to_string;

fn read_lines(filename: &str) -> Vec<String> {
    read_to_string(filename) 
        .unwrap()  // panic on possible file-reading errors
        .lines()  // split the string into an iterator of string slices
        .map(String::from)  // make each slice into a string
        .collect()  // gather them together into a vector
}
}

Note that in both examples above, we must convert the &str reference returned from lines() to the owned type String, using .to_string() and String::from respectively.

A more efficient approach

Here we pass ownership of the open File to a BufReader struct. BufReader uses an internal buffer to reduce intermediate allocations.

We also update read_lines to return an iterator instead of allocating new String objects in memory for each line.

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

fn main() {
    // File hosts.txt must exist in the current path
    if let Ok(lines) = read_lines("./hosts.txt") {
        // Consumes the iterator, returns an (Optional) String
        for line in lines {
            if let Ok(ip) = line {
                println!("{}", ip);
            }
        }
    }
}

// The output is wrapped in a Result to allow matching on errors
// Returns an Iterator to the Reader of the lines of the file.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

Running this program simply prints the lines individually.

$ echo -e "127.0.0.1\n192.168.0.1\n" > hosts.txt
$ rustc read_lines.rs && ./read_lines
127.0.0.1
192.168.0.1

(Note that since File::open expects a generic AsRef<Path> as argument, we define our generic read_lines() method with the same generic constraint, using the where keyword.)

This process is more efficient than creating a String in memory with all of the file's contents. This can especially cause performance issues when working with larger files.

Child processes

The process::Output struct represents the output of a finished child process, and the process::Command struct is a process builder.

use std::process::Command;

fn main() {
    let output = Command::new("rustc")
        .arg("--version")
        .output().unwrap_or_else(|e| {
            panic!("failed to execute process: {}", e)
    });

    if output.status.success() {
        let s = String::from_utf8_lossy(&output.stdout);

        print!("rustc succeeded and stdout was:\n{}", s);
    } else {
        let s = String::from_utf8_lossy(&output.stderr);

        print!("rustc failed and stderr was:\n{}", s);
    }
}

(You are encouraged to try the previous example with an incorrect flag passed to rustc)

Pipes

The std::Child struct represents a running child process, and exposes the stdin, stdout and stderr handles for interaction with the underlying process via pipes.

use std::io::prelude::*;
use std::process::{Command, Stdio};

static PANGRAM: &'static str =
"the quick brown fox jumped over the lazy dog\n";

fn main() {
    // Spawn the `wc` command
    let process = match Command::new("wc")
                                .stdin(Stdio::piped())
                                .stdout(Stdio::piped())
                                .spawn() {
        Err(why) => panic!("couldn't spawn wc: {}", why),
        Ok(process) => process,
    };

    // Write a string to the `stdin` of `wc`.
    //
    // `stdin` has type `Option<ChildStdin>`, but since we know this instance
    // must have one, we can directly `unwrap` it.
    match process.stdin.unwrap().write_all(PANGRAM.as_bytes()) {
        Err(why) => panic!("couldn't write to wc stdin: {}", why),
        Ok(_) => println!("sent pangram to wc"),
    }

    // Because `stdin` does not live after the above calls, it is `drop`ed,
    // and the pipe is closed.
    //
    // This is very important, otherwise `wc` wouldn't start processing the
    // input we just sent.

    // The `stdout` field also has type `Option<ChildStdout>` so must be unwrapped.
    let mut s = String::new();
    match process.stdout.unwrap().read_to_string(&mut s) {
        Err(why) => panic!("couldn't read wc stdout: {}", why),
        Ok(_) => print!("wc responded with:\n{}", s),
    }
}

Wait

If you'd like to wait for a process::Child to finish, you must call Child::wait, which will return a process::ExitStatus.

use std::process::Command;

fn main() {
    let mut child = Command::new("sleep").arg("5").spawn().unwrap();
    let _result = child.wait().unwrap();

    println!("reached end of main");
}
$ rustc wait.rs && ./wait
# `wait` keeps running for 5 seconds until the `sleep 5` command finishes
reached end of main

Filesystem Operations

The std::fs module contains several functions that deal with the filesystem.

use std::fs;
use std::fs::{File, OpenOptions};
use std::io;
use std::io::prelude::*;
use std::os::unix;
use std::path::Path;

// A simple implementation of `% cat path`
fn cat(path: &Path) -> io::Result<String> {
    let mut f = File::open(path)?;
    let mut s = String::new();
    match f.read_to_string(&mut s) {
        Ok(_) => Ok(s),
        Err(e) => Err(e),
    }
}

// A simple implementation of `% echo s > path`
fn echo(s: &str, path: &Path) -> io::Result<()> {
    let mut f = File::create(path)?;

    f.write_all(s.as_bytes())
}

// A simple implementation of `% touch path` (ignores existing files)
fn touch(path: &Path) -> io::Result<()> {
    match OpenOptions::new().create(true).write(true).open(path) {
        Ok(_) => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    println!("`mkdir a`");
    // Create a directory, returns `io::Result<()>`
    match fs::create_dir("a") {
        Err(why) => println!("! {:?}", why.kind()),
        Ok(_) => {},
    }

    println!("`echo hello > a/b.txt`");
    // The previous match can be simplified using the `unwrap_or_else` method
    echo("hello", &Path::new("a/b.txt")).unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`mkdir -p a/c/d`");
    // Recursively create a directory, returns `io::Result<()>`
    fs::create_dir_all("a/c/d").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`touch a/c/e.txt`");
    touch(&Path::new("a/c/e.txt")).unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`ln -s ../b.txt a/c/b.txt`");
    // Create a symbolic link, returns `io::Result<()>`
    if cfg!(target_family = "unix") {
        unix::fs::symlink("../b.txt", "a/c/b.txt").unwrap_or_else(|why| {
            println!("! {:?}", why.kind());
        });
    }

    println!("`cat a/c/b.txt`");
    match cat(&Path::new("a/c/b.txt")) {
        Err(why) => println!("! {:?}", why.kind()),
        Ok(s) => println!("> {}", s),
    }

    println!("`ls a`");
    // Read the contents of a directory, returns `io::Result<Vec<Path>>`
    match fs::read_dir("a") {
        Err(why) => println!("! {:?}", why.kind()),
        Ok(paths) => for path in paths {
            println!("> {:?}", path.unwrap().path());
        },
    }

    println!("`rm a/c/e.txt`");
    // Remove a file, returns `io::Result<()>`
    fs::remove_file("a/c/e.txt").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });

    println!("`rmdir a/c/d`");
    // Remove an empty directory, returns `io::Result<()>`
    fs::remove_dir("a/c/d").unwrap_or_else(|why| {
        println!("! {:?}", why.kind());
    });
}

Here's the expected successful output:

$ rustc fs.rs && ./fs
`mkdir a`
`echo hello > a/b.txt`
`mkdir -p a/c/d`
`touch a/c/e.txt`
`ln -s ../b.txt a/c/b.txt`
`cat a/c/b.txt`
> hello
`ls a`
> "a/b.txt"
> "a/c"
`rm a/c/e.txt`
`rmdir a/c/d`

And the final state of the a directory is:

$ tree a
a
|-- b.txt
`-- c
    `-- b.txt -> ../b.txt

1 directory, 2 files

An alternative way to define the function cat is with ? notation:

fn cat(path: &Path) -> io::Result<String> {
    let mut f = File::open(path)?;
    let mut s = String::new();
    f.read_to_string(&mut s)?;
    Ok(s)
}

See also:

cfg!

Program arguments

Standard Library

The command line arguments can be accessed using std::env::args, which returns an iterator that yields a String for each argument:

use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();

    // The first argument is the path that was used to call the program.
    println!("My path is {}.", args[0]);

    // The rest of the arguments are the passed command line parameters.
    // Call the program like this:
    //   $ ./args arg1 arg2
    println!("I got {:?} arguments: {:?}.", args.len() - 1, &args[1..]);
}
$ ./args 1 2 3
My path is ./args.
I got 3 arguments: ["1", "2", "3"].

Crates

Alternatively, there are numerous crates that can provide extra functionality when creating command-line applications. The Rust Cookbook exhibits best practices on how to use one of the more popular command line argument crates, clap.

Argument parsing

Matching can be used to parse simple arguments:

use std::env;

fn increase(number: i32) {
    println!("{}", number + 1);
}

fn decrease(number: i32) {
    println!("{}", number - 1);
}

fn help() {
    println!("usage:
match_args <string>
    Check whether given string is the answer.
match_args {{increase|decrease}} <integer>
    Increase or decrease given integer by one.");
}

fn main() {
    let args: Vec<String> = env::args().collect();

    match args.len() {
        // no arguments passed
        1 => {
            println!("My name is 'match_args'. Try passing some arguments!");
        },
        // one argument passed
        2 => {
            match args[1].parse() {
                Ok(42) => println!("This is the answer!"),
                _ => println!("This is not the answer."),
            }
        },
        // one command and one argument passed
        3 => {
            let cmd = &args[1];
            let num = &args[2];
            // parse the number
            let number: i32 = match num.parse() {
                Ok(n) => {
                    n
                },
                Err(_) => {
                    eprintln!("error: second argument not an integer");
                    help();
                    return;
                },
            };
            // parse the command
            match &cmd[..] {
                "increase" => increase(number),
                "decrease" => decrease(number),
                _ => {
                    eprintln!("error: invalid command");
                    help();
                },
            }
        },
        // all the other cases
        _ => {
            // show a help message
            help();
        }
    }
}
$ ./match_args Rust
This is not the answer.
$ ./match_args 42
This is the answer!
$ ./match_args do something
error: second argument not an integer
usage:
match_args <string>
    Check whether given string is the answer.
match_args {increase|decrease} <integer>
    Increase or decrease given integer by one.
$ ./match_args do 42
error: invalid command
usage:
match_args <string>
    Check whether given string is the answer.
match_args {increase|decrease} <integer>
    Increase or decrease given integer by one.
$ ./match_args increase 42
43

Foreign Function Interface

Rust provides a Foreign Function Interface (FFI) to C libraries. Foreign functions must be declared inside an extern block annotated with a #[link] attribute containing the name of the foreign library.

use std::fmt;

// this extern block links to the libm library
#[link(name = "m")]
extern {
    // this is a foreign function
    // that computes the square root of a single precision complex number
    fn csqrtf(z: Complex) -> Complex;

    fn ccosf(z: Complex) -> Complex;
}

// Since calling foreign functions is considered unsafe,
// it's common to write safe wrappers around them.
fn cos(z: Complex) -> Complex {
    unsafe { ccosf(z) }
}

fn main() {
    // z = -1 + 0i
    let z = Complex { re: -1., im: 0. };

    // calling a foreign function is an unsafe operation
    let z_sqrt = unsafe { csqrtf(z) };

    println!("the square root of {:?} is {:?}", z, z_sqrt);

    // calling safe API wrapped around unsafe operation
    println!("cos({:?}) = {:?}", z, cos(z));
}

// Minimal implementation of single precision complex numbers
#[repr(C)]
#[derive(Clone, Copy)]
struct Complex {
    re: f32,
    im: f32,
}

impl fmt::Debug for Complex {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        if self.im < 0. {
            write!(f, "{}-{}i", self.re, -self.im)
        } else {
            write!(f, "{}+{}i", self.re, self.im)
        }
    }
}

Testing

Rust is a programming language that cares a lot about correctness and it includes support for writing software tests within the language itself.

Testing comes in three styles:

Also Rust has support for specifying additional dependencies for tests:

See Also

Unit testing

Tests are Rust functions that verify that the non-test code is functioning in the expected manner. The bodies of test functions typically perform some setup, run the code we want to test, then assert whether the results are what we expect.

Most unit tests go into a tests mod with the #[cfg(test)] attribute. Test functions are marked with the #[test] attribute.

Tests fail when something in the test function panics. There are some helper macros:

  • assert!(expression) - panics if expression evaluates to false.
  • assert_eq!(left, right) and assert_ne!(left, right) - testing left and right expressions for equality and inequality respectively.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// This is a really bad adding function, its purpose is to fail in this
// example.
#[allow(dead_code)]
fn bad_add(a: i32, b: i32) -> i32 {
    a - b
}

#[cfg(test)]
mod tests {
    // Note this useful idiom: importing names from outer (for mod tests) scope.
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(1, 2), 3);
    }

    #[test]
    fn test_bad_add() {
        // This assert would fire and test will fail.
        // Please note, that private functions can be tested too!
        assert_eq!(bad_add(1, 2), 3);
    }
}

Tests can be run with cargo test.

$ cargo test

running 2 tests
test tests::test_bad_add ... FAILED
test tests::test_add ... ok

failures:

---- tests::test_bad_add stdout ----
        thread 'tests::test_bad_add' panicked at 'assertion failed: `(left == right)`
  left: `-1`,
 right: `3`', src/lib.rs:21:8
note: Run with `RUST_BACKTRACE=1` for a backtrace.


failures:
    tests::test_bad_add

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

Tests and ?

None of the previous unit test examples had a return type. But in Rust 2018, your unit tests can return Result<()>, which lets you use ? in them! This can make them much more concise.

fn sqrt(number: f64) -> Result<f64, String> {
    if number >= 0.0 {
        Ok(number.powf(0.5))
    } else {
        Err("negative floats don't have square roots".to_owned())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_sqrt() -> Result<(), String> {
        let x = 4.0;
        assert_eq!(sqrt(x)?.powf(2.0), x);
        Ok(())
    }
}

See "The Edition Guide" for more details.

Testing panics

To check functions that should panic under certain circumstances, use attribute #[should_panic]. This attribute accepts optional parameter expected = with the text of the panic message. If your function can panic in multiple ways, it helps make sure your test is testing the correct panic.

pub fn divide_non_zero_result(a: u32, b: u32) -> u32 {
    if b == 0 {
        panic!("Divide-by-zero error");
    } else if a < b {
        panic!("Divide result is zero");
    }
    a / b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_divide() {
        assert_eq!(divide_non_zero_result(10, 2), 5);
    }

    #[test]
    #[should_panic]
    fn test_any_panic() {
        divide_non_zero_result(1, 0);
    }

    #[test]
    #[should_panic(expected = "Divide result is zero")]
    fn test_specific_panic() {
        divide_non_zero_result(1, 10);
    }
}

Running these tests gives us:

$ cargo test

running 3 tests
test tests::test_any_panic ... ok
test tests::test_divide ... ok
test tests::test_specific_panic ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests tmp-test-should-panic

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Running specific tests

To run specific tests one may specify the test name to cargo test command.

$ cargo test test_any_panic
running 1 test
test tests::test_any_panic ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out

   Doc-tests tmp-test-should-panic

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

To run multiple tests one may specify part of a test name that matches all the tests that should be run.

$ cargo test panic
running 2 tests
test tests::test_any_panic ... ok
test tests::test_specific_panic ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out

   Doc-tests tmp-test-should-panic

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Ignoring tests

Tests can be marked with the #[ignore] attribute to exclude some tests. Or to run them with command cargo test -- --ignored

#![allow(unused)]
fn main() {
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(2, 2), 4);
    }

    #[test]
    fn test_add_hundred() {
        assert_eq!(add(100, 2), 102);
        assert_eq!(add(2, 100), 102);
    }

    #[test]
    #[ignore]
    fn ignored_test() {
        assert_eq!(add(0, 0), 0);
    }
}
}
$ cargo test
running 3 tests
test tests::ignored_test ... ignored
test tests::test_add ... ok
test tests::test_add_hundred ... ok

test result: ok. 2 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out

   Doc-tests tmp-ignore

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

$ cargo test -- --ignored
running 1 test
test tests::ignored_test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests tmp-ignore

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Documentation testing

The primary way of documenting a Rust project is through annotating the source code. Documentation comments are written in CommonMark Markdown specification and support code blocks in them. Rust takes care about correctness, so these code blocks are compiled and used as documentation tests.

/// First line is a short summary describing function.
///
/// The next lines present detailed documentation. Code blocks start with
/// triple backquotes and have implicit `fn main()` inside
/// and `extern crate <cratename>`. Assume we're testing `doccomments` crate:
///
/// ```
/// let result = doccomments::add(2, 3);
/// assert_eq!(result, 5);
/// ```
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Usually doc comments may include sections "Examples", "Panics" and "Failures".
///
/// The next function divides two numbers.
///
/// # Examples
///
/// ```
/// let result = doccomments::div(10, 2);
/// assert_eq!(result, 5);
/// ```
///
/// # Panics
///
/// The function panics if the second argument is zero.
///
/// ```rust,should_panic
/// // panics on division by zero
/// doccomments::div(10, 0);
/// ```
pub fn div(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("Divide-by-zero error");
    }

    a / b
}

Code blocks in documentation are automatically tested when running the regular cargo test command:

$ cargo test
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests doccomments

running 3 tests
test src/lib.rs - add (line 7) ... ok
test src/lib.rs - div (line 21) ... ok
test src/lib.rs - div (line 31) ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Motivation behind documentation tests

The main purpose of documentation tests is to serve as examples that exercise the functionality, which is one of the most important guidelines. It allows using examples from docs as complete code snippets. But using ? makes compilation fail since main returns unit. The ability to hide some source lines from documentation comes to the rescue: one may write fn try_main() -> Result<(), ErrorType>, hide it and unwrap it in hidden main. Sounds complicated? Here's an example:

/// Using hidden `try_main` in doc tests.
///
/// ```
/// # // hidden lines start with `#` symbol, but they're still compilable!
/// # fn try_main() -> Result<(), String> { // line that wraps the body shown in doc
/// let res = doccomments::try_div(10, 2)?;
/// # Ok(()) // returning from try_main
/// # }
/// # fn main() { // starting main that'll unwrap()
/// #    try_main().unwrap(); // calling try_main and unwrapping
/// #                         // so that test will panic in case of error
/// # }
/// ```
pub fn try_div(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err(String::from("Divide-by-zero"))
    } else {
        Ok(a / b)
    }
}

See Also

Integration testing

Unit tests are testing one module in isolation at a time: they're small and can test private code. Integration tests are external to your crate and use only its public interface in the same way any other code would. Their purpose is to test that many parts of your library work correctly together.

Cargo looks for integration tests in tests directory next to src.

File src/lib.rs:

// Define this in a crate called `adder`.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

File with test: tests/integration_test.rs:

#[test]
fn test_add() {
    assert_eq!(adder::add(3, 2), 5);
}

Running tests with cargo test command:

$ cargo test
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/debug/deps/integration_test-bcd60824f5fbfe19

running 1 test
test test_add ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests adder

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Each Rust source file in the tests directory is compiled as a separate crate. In order to share some code between integration tests we can make a module with public functions, importing and using it within tests.

File tests/common/mod.rs:

pub fn setup() {
    // some setup code, like creating required files/directories, starting
    // servers, etc.
}

File with test: tests/integration_test.rs

// importing common module.
mod common;

#[test]
fn test_add() {
    // using common code.
    common::setup();
    assert_eq!(adder::add(3, 2), 5);
}

Creating the module as tests/common.rs also works, but is not recommended because the test runner will treat the file as a test crate and try to run tests inside it.

Development dependencies

Sometimes there is a need to have dependencies for tests (or examples, or benchmarks) only. Such dependencies are added to Cargo.toml in the [dev-dependencies] section. These dependencies are not propagated to other packages which depend on this package.

One such example is pretty_assertions, which extends standard assert_eq! and assert_ne! macros, to provide colorful diff.
File Cargo.toml:

# standard crate data is left out
[dev-dependencies]
pretty_assertions = "1"

File src/lib.rs:

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;
    use pretty_assertions::assert_eq; // crate for test-only use. Cannot be used in non-test code.

    #[test]
    fn test_add() {
        assert_eq!(add(2, 3), 5);
    }
}

See Also

Cargo docs on specifying dependencies.

Unsafe Operations

As an introduction to this section, to borrow from the official docs, "one should try to minimize the amount of unsafe code in a code base." With that in mind, let's get started! Unsafe annotations in Rust are used to bypass protections put in place by the compiler; specifically, there are four primary things that unsafe is used for:

  • dereferencing raw pointers
  • calling functions or methods which are unsafe (including calling a function over FFI, see a previous chapter of the book)
  • accessing or modifying static mutable variables
  • implementing unsafe traits

Raw Pointers

Raw pointers * and references &T function similarly, but references are always safe because they are guaranteed to point to valid data due to the borrow checker. Dereferencing a raw pointer can only be done through an unsafe block.

fn main() {
    let raw_p: *const u32 = &10;

    unsafe {
        assert!(*raw_p == 10);
    }
}

Calling Unsafe Functions

Some functions can be declared as unsafe, meaning it is the programmer's responsibility to ensure correctness instead of the compiler's. One example of this is std::slice::from_raw_parts which will create a slice given a pointer to the first element and a length:

use std::slice;

fn main() {
    let some_vector = vec![1, 2, 3, 4];

    let pointer = some_vector.as_ptr();
    let length = some_vector.len();

    unsafe {
        let my_slice: &[u32] = slice::from_raw_parts(pointer, length);

        assert_eq!(some_vector.as_slice(), my_slice);
    }
}

For slice::from_raw_parts, one of the assumptions which must be upheld is that the pointer passed in points to valid memory and that the memory pointed to is of the correct type. If these invariants aren't upheld then the program's behaviour is undefined and there is no knowing what will happen.

Inline assembly

Rust provides support for inline assembly via the asm! macro. It can be used to embed handwritten assembly in the assembly output generated by the compiler. Generally this should not be necessary, but might be where the required performance or timing cannot be otherwise achieved. Accessing low level hardware primitives, e.g. in kernel code, may also demand this functionality.

Note: the examples here are given in x86/x86-64 assembly, but other architectures are also supported.

Inline assembly is currently supported on the following architectures:

  • x86 and x86-64
  • ARM
  • AArch64
  • RISC-V

Basic usage

Let us start with the simplest possible example:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

unsafe {
    asm!("nop");
}
}
}

This will insert a NOP (no operation) instruction into the assembly generated by the compiler. Note that all asm! invocations have to be inside an unsafe block, as they could insert arbitrary instructions and break various invariants. The instructions to be inserted are listed in the first argument of the asm! macro as a string literal.

Inputs and outputs

Now inserting an instruction that does nothing is rather boring. Let us do something that actually acts on data:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let x: u64;
unsafe {
    asm!("mov {}, 5", out(reg) x);
}
assert_eq!(x, 5);
}
}

This will write the value 5 into the u64 variable x. You can see that the string literal we use to specify instructions is actually a template string. It is governed by the same rules as Rust format strings. The arguments that are inserted into the template however look a bit different than you may be familiar with. First we need to specify if the variable is an input or an output of the inline assembly. In this case it is an output. We declared this by writing out. We also need to specify in what kind of register the assembly expects the variable. In this case we put it in an arbitrary general purpose register by specifying reg. The compiler will choose an appropriate register to insert into the template and will read the variable from there after the inline assembly finishes executing.

Let us see another example that also uses an input:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let i: u64 = 3;
let o: u64;
unsafe {
    asm!(
        "mov {0}, {1}",
        "add {0}, 5",
        out(reg) o,
        in(reg) i,
    );
}
assert_eq!(o, 8);
}
}

This will add 5 to the input in variable i and write the result to variable o. The particular way this assembly does this is first copying the value from i to the output, and then adding 5 to it.

The example shows a few things:

First, we can see that asm! allows multiple template string arguments; each one is treated as a separate line of assembly code, as if they were all joined together with newlines between them. This makes it easy to format assembly code.

Second, we can see that inputs are declared by writing in instead of out.

Third, we can see that we can specify an argument number, or name as in any format string. For inline assembly templates this is particularly useful as arguments are often used more than once. For more complex inline assembly using this facility is generally recommended, as it improves readability, and allows reordering instructions without changing the argument order.

We can further refine the above example to avoid the mov instruction:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut x: u64 = 3;
unsafe {
    asm!("add {0}, 5", inout(reg) x);
}
assert_eq!(x, 8);
}
}

We can see that inout is used to specify an argument that is both input and output. This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.

It is also possible to specify different variables for the input and output parts of an inout operand:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let x: u64 = 3;
let y: u64;
unsafe {
    asm!("add {0}, 5", inout(reg) x => y);
}
assert_eq!(y, 8);
}
}

Late output operands

The Rust compiler is conservative with its allocation of operands. It is assumed that an out can be written at any time, and can therefore not share its location with any other argument. However, to guarantee optimal performance it is important to use as few registers as possible, so they won't have to be saved and reloaded around the inline assembly block. To achieve this Rust provides a lateout specifier. This can be used on any output that is written only after all inputs have been consumed. There is also a inlateout variant of this specifier.

Here is an example where inlateout cannot be used in release mode or other optimized cases:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
let c: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        "add {0}, {2}",
        inout(reg) a,
        in(reg) b,
        in(reg) c,
    );
}
assert_eq!(a, 12);
}
}

The above could work well in unoptimized cases (Debug mode), but if you want optimized performance (release mode or other optimized cases), it could not work.

That is because in optimized cases, the compiler is free to allocate the same register for inputs b and c since it knows they have the same value. However it must allocate a separate register for a since it uses inout and not inlateout. If inlateout was used, then a and c could be allocated to the same register, in which case the first instruction to overwrite the value of c and cause the assembly code to produce the wrong result.

However the following example can use inlateout since the output is only modified after all input registers have been read:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!("add {0}, {1}", inlateout(reg) a, in(reg) b);
}
assert_eq!(a, 8);
}
}

As you can see, this assembly fragment will still work correctly if a and b are assigned to the same register.

Explicit register operands

Some instructions require that the operands be in a specific register. Therefore, Rust inline assembly provides some more specific constraint specifiers. While reg is generally available on any architecture, explicit registers are highly architecture specific. E.g. for x86 the general purpose registers eax, ebx, ecx, edx, ebp, esi, and edi among others can be addressed by their name.

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let cmd = 0xd1;
unsafe {
    asm!("out 0x64, eax", in("eax") cmd);
}
}
}

In this example we call the out instruction to output the content of the cmd variable to port 0x64. Since the out instruction only accepts eax (and its sub registers) as operand we had to use the eax constraint specifier.

Note: unlike other operand types, explicit register operands cannot be used in the template string: you can't use {} and should write the register name directly instead. Also, they must appear at the end of the operand list after all other operand types.

Consider this example which uses the x86 mul instruction:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

fn mul(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;

    unsafe {
        asm!(
            // The x86 mul instruction takes rax as an implicit input and writes
            // the 128-bit result of the multiplication to rax:rdx.
            "mul {}",
            in(reg) a,
            inlateout("rax") b => lo,
            lateout("rdx") hi
        );
    }

    ((hi as u128) << 64) + lo as u128
}
}
}

This uses the mul instruction to multiply two 64-bit inputs with a 128-bit result. The only explicit operand is a register, that we fill from the variable a. The second operand is implicit, and must be the rax register, which we fill from the variable b. The lower 64 bits of the result are stored in rax from which we fill the variable lo. The higher 64 bits are stored in rdx from which we fill the variable hi.

Clobbered registers

In many cases inline assembly will modify state that is not needed as an output. Usually this is either because we have to use a scratch register in the assembly or because instructions modify state that we don't need to further examine. This state is generally referred to as being "clobbered". We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.

use std::arch::asm;

#[cfg(target_arch = "x86_64")]
fn main() {
    // three entries of four bytes each
    let mut name_buf = [0_u8; 12];
    // String is stored as ascii in ebx, edx, ecx in order
    // Because ebx is reserved, the asm needs to preserve the value of it.
    // So we push and pop it around the main asm.
    // (in 64 bit mode for 64 bit processors, 32 bit processors would use ebx)

    unsafe {
        asm!(
            "push rbx",
            "cpuid",
            "mov [rdi], ebx",
            "mov [rdi + 4], edx",
            "mov [rdi + 8], ecx",
            "pop rbx",
            // We use a pointer to an array for storing the values to simplify
            // the Rust code at the cost of a couple more asm instructions
            // This is more explicit with how the asm works however, as opposed
            // to explicit register outputs such as `out("ecx") val`
            // The *pointer itself* is only an input even though it's written behind
            in("rdi") name_buf.as_mut_ptr(),
            // select cpuid 0, also specify eax as clobbered
            inout("eax") 0 => _,
            // cpuid clobbers these registers too
            out("ecx") _,
            out("edx") _,
        );
    }

    let name = core::str::from_utf8(&name_buf).unwrap();
    println!("CPU Manufacturer ID: {}", name);
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}

In the example above we use the cpuid instruction to read the CPU manufacturer ID. This instruction writes to eax with the maximum supported cpuid argument and ebx, edx, and ecx with the CPU manufacturer ID as ASCII bytes in that order.

Even though eax is never read we still need to tell the compiler that the register has been modified so that the compiler can save any values that were in these registers before the asm. This is done by declaring it as an output but with _ instead of a variable name, which indicates that the output value is to be discarded.

This code also works around the limitation that ebx is a reserved register by LLVM. That means that LLVM assumes that it has full control over the register and it must be restored to its original state before exiting the asm block, so it cannot be used as an input or output except if the compiler uses it to fulfill a general register class (e.g. in(reg)). This makes reg operands dangerous when using reserved registers as we could unknowingly corrupt our input or output because they share the same register.

To work around this we use rdi to store the pointer to the output array, save ebx via push, read from ebx inside the asm block into the array and then restore ebx to its original state via pop. The push and pop use the full 64-bit rbx version of the register to ensure that the entire register is saved. On 32 bit targets the code would instead use ebx in the push/pop.

This can also be used with a general register class to obtain a scratch register for use inside the asm code:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

// Multiply x by 6 using shifts and adds
let mut x: u64 = 4;
unsafe {
    asm!(
        "mov {tmp}, {x}",
        "shl {tmp}, 1",
        "shl {x}, 2",
        "add {x}, {tmp}",
        x = inout(reg) x,
        tmp = out(reg) _,
    );
}
assert_eq!(x, 4 * 6);
}
}

Symbol operands and ABI clobbers

By default, asm! assumes that any register not specified as an output will have its contents preserved by the assembly code. The clobber_abi argument to asm! tells the compiler to automatically insert the necessary clobber operands according to the given calling convention ABI: any register which is not fully preserved in that ABI will be treated as clobbered. Multiple clobber_abi arguments may be provided and all clobbers from all specified ABIs will be inserted.

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

extern "C" fn foo(arg: i32) -> i32 {
    println!("arg = {}", arg);
    arg * 2
}

fn call_foo(arg: i32) -> i32 {
    unsafe {
        let result;
        asm!(
            "call {}",
            // Function pointer to call
            in(reg) foo,
            // 1st argument in rdi
            in("rdi") arg,
            // Return value in rax
            out("rax") result,
            // Mark all registers which are not preserved by the "C" calling
            // convention as clobbered.
            clobber_abi("C"),
        );
        result
    }
}
}
}

Register template modifiers

In some cases, fine control is needed over the way a register name is formatted when inserted into the template string. This is needed when an architecture's assembly language has several names for the same register, each typically being a "view" over a subset of the register (e.g. the low 32 bits of a 64-bit register).

By default the compiler will always choose the name that refers to the full register size (e.g. rax on x86-64, eax on x86, etc).

This default can be overridden by using modifiers on the template string operands, just like you would with format strings:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut x: u16 = 0xab;

unsafe {
    asm!("mov {0:h}, {0:l}", inout(reg_abcd) x);
}

assert_eq!(x, 0xabab);
}
}

In this example, we use the reg_abcd register class to restrict the register allocator to the 4 legacy x86 registers (ax, bx, cx, dx) of which the first two bytes can be addressed independently.

Let us assume that the register allocator has chosen to allocate x in the ax register. The h modifier will emit the register name for the high byte of that register and the l modifier will emit the register name for the low byte. The asm code will therefore be expanded as mov ah, al which copies the low byte of the value into the high byte.

If you use a smaller data type (e.g. u16) with an operand and forget to use template modifiers, the compiler will emit a warning and suggest the correct modifier to use.

Memory address operands

Sometimes assembly instructions require operands passed via memory addresses/memory locations. You have to manually use the memory address syntax specified by the target architecture. For example, on x86/x86_64 using Intel assembly syntax, you should wrap inputs/outputs in [] to indicate they are memory operands:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

fn load_fpu_control_word(control: u16) {
    unsafe {
        asm!("fldcw [{}]", in(reg) &control, options(nostack));
    }
}
}
}

Labels

Any reuse of a named label, local or otherwise, can result in an assembler or linker error or may cause other strange behavior. Reuse of a named label can happen in a variety of ways including:

  • explicitly: using a label more than once in one asm! block, or multiple times across blocks.
  • implicitly via inlining: the compiler is allowed to instantiate multiple copies of an asm! block, for example when the function containing it is inlined in multiple places.
  • implicitly via LTO: LTO can cause code from other crates to be placed in the same codegen unit, and so could bring in arbitrary labels.

As a consequence, you should only use GNU assembler numeric local labels inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.

Moreover, on x86 when using the default Intel syntax, due to an LLVM bug, you shouldn't use labels exclusively made of 0 and 1 digits, e.g. 0, 11 or 101010, as they may end up being interpreted as binary values. Using options(att_syntax) will avoid any ambiguity, but that affects the syntax of the entire asm! block. (See Options, below, for more on options.)

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a = 0;
unsafe {
    asm!(
        "mov {0}, 10",
        "2:",
        "sub {0}, 1",
        "cmp {0}, 3",
        "jle 2f",
        "jmp 2b",
        "2:",
        "add {0}, 2",
        out(reg) a
    );
}
assert_eq!(a, 5);
}
}

This will decrement the {0} register value from 10 to 3, then add 2 and store it in a.

This example shows a few things:

  • First, that the same number can be used as a label multiple times in the same inline block.
  • Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes “b” (“backward”) or ”f” (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.

Options

By default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc. However, in many cases it is desirable to give the compiler more information about what the assembly code is actually doing so that it can optimize better.

Let's take our previous example of an add instruction:

#![allow(unused)]
fn main() {
#[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        inlateout(reg) a, in(reg) b,
        options(pure, nomem, nostack),
    );
}
assert_eq!(a, 8);
}
}

Options can be provided as an optional final argument to the asm! macro. We specified three options here:

  • pure means that the asm code has no observable side effects and that its output depends only on its inputs. This allows the compiler optimizer to call the inline asm fewer times or even eliminate it entirely.
  • nomem means that the asm code does not read or write to memory. By default the compiler will assume that inline assembly can read or write any memory address that is accessible to it (e.g. through a pointer passed as an operand, or a global).
  • nostack means that the asm code does not push any data onto the stack. This allows the compiler to use optimizations such as the stack red zone on x86-64 to avoid stack pointer adjustments.

These allow the compiler to better optimize code using asm!, for example by eliminating pure asm! blocks whose outputs are not needed.

See the reference for the full list of available options and their effects.

Compatibility

The Rust language is fastly evolving, and because of this certain compatibility issues can arise, despite efforts to ensure forwards-compatibility wherever possible.

Raw identifiers

Rust, like many programming languages, has the concept of "keywords". These identifiers mean something to the language, and so you cannot use them in places like variable names, function names, and other places. Raw identifiers let you use keywords where they would not normally be allowed. This is particularly useful when Rust introduces new keywords, and a library using an older edition of Rust has a variable or function with the same name as a keyword introduced in a newer edition.

For example, consider a crate foo compiled with the 2015 edition of Rust that exports a function named try. This keyword is reserved for a new feature in the 2018 edition, so without raw identifiers, we would have no way to name the function.

extern crate foo;

fn main() {
    foo::try();
}

You'll get this error:

error: expected identifier, found keyword `try`
 --> src/main.rs:4:4
  |
4 | foo::try();
  |      ^^^ expected identifier, found keyword

You can write this with a raw identifier:

extern crate foo;

fn main() {
    foo::r#try();
}

Meta

Some topics aren't exactly relevant to how you program but provide you tooling or infrastructure support which just makes things better for everyone. These topics include:

  • Documentation: Generate library documentation for users via the included rustdoc.
  • Playground: Integrate the Rust Playground in your documentation.

Documentation

Use cargo doc to build documentation in target/doc.

Use cargo test to run all tests (including documentation tests), and cargo test --doc to only run documentation tests.

These commands will appropriately invoke rustdoc (and rustc) as required.

Doc comments

Doc comments are very useful for big projects that require documentation. When running rustdoc, these are the comments that get compiled into documentation. They are denoted by a ///, and support Markdown.

#![crate_name = "doc"]

/// A human being is represented here
pub struct Person {
    /// A person must have a name, no matter how much Juliet may hate it
    name: String,
}

impl Person {
    /// Returns a person with the name given them
    ///
    /// # Arguments
    ///
    /// * `name` - A string slice that holds the name of the person
    ///
    /// # Examples
    ///
    /// ```
    /// // You can have rust code between fences inside the comments
    /// // If you pass --test to `rustdoc`, it will even test it for you!
    /// use doc::Person;
    /// let person = Person::new("name");
    /// ```
    pub fn new(name: &str) -> Person {
        Person {
            name: name.to_string(),
        }
    }

    /// Gives a friendly hello!
    ///
    /// Says "Hello, [name](Person::name)" to the `Person` it is called on.
    pub fn hello(& self) {
        println!("Hello, {}!", self.name);
    }
}

fn main() {
    let john = Person::new("John");

    john.hello();
}

To run the tests, first build the code as a library, then tell rustdoc where to find the library so it can link it into each doctest program:

$ rustc doc.rs --crate-type lib
$ rustdoc --test --extern doc="libdoc.rlib" doc.rs

Doc attributes

Below are a few examples of the most common #[doc] attributes used with rustdoc.

inline

Used to inline docs, instead of linking out to separate page.

#[doc(inline)]
pub use bar::Bar;

/// bar docs
mod bar {
    /// the docs for Bar
    pub struct Bar;
}

no_inline

Used to prevent linking out to separate page or anywhere.

// Example from libcore/prelude
#[doc(no_inline)]
pub use crate::mem::drop;

hidden

Using this tells rustdoc not to include this in documentation:

// Example from the futures-rs library
#[doc(hidden)]
pub use self::async_await::*;

For documentation, rustdoc is widely used by the community. It's what is used to generate the std library docs.

See also:

Playground

The Rust Playground is a way to experiment with Rust code through a web interface.

Using it with mdbook

In mdbook, you can make code examples playable and editable.

fn main() {
    println!("Hello World!");
}

This allows the reader to both run your code sample, but also modify and tweak it. The key here is the adding the word editable to your codefence block separated by a comma.

```rust,editable
//...place your code here
```

Additionally, you can add ignore if you want mdbook to skip your code when it builds and tests.

```rust,editable,ignore
//...place your code here
```

Using it with docs

You may have noticed in some of the official Rust docs a button that says "Run", which opens the code sample up in a new tab in Rust Playground. This feature is enabled if you use the #[doc] attribute called html_playground_url.

See also: