Rust com Exemplos
Rust é uma linguagem de programação de sistemas moderna focada na segurança, velocidade e concorrência. Ela cumpre estes objetivos sendo segura quanto ao uso da memória sem usar recolha de lixo.
Rust com Exemplos é uma coleção de exemplos executáveis que ilustram vários conceitos da Rust e bibliotecas padrão. Para obter ainda mais destes exemplos, não te esqueças de instalar a Rust localmente e consulte a documentação oficial. Adicionalmente para os curiosos, também podes consultar o código-fonte para este local.
Agora começaremos!
-
Olá Mundo - Comece com um programa "Olá Mundo" tradicional.
-
Primitivos - Estude os inteiros com sinal, inteiros sem sinal e outros primitivos.
-
Tipos Personalizados -
struct
eenum
. -
Vínculos de Variável - vínculos mutáveis, âmbito, obscurecimento.
-
Tipos - Estude as mudanças e definição de tipos.
-
Controlo de Fluxo -
if
/else
,for
, e outros. -
Funções - Estude os Métodos, Fechamentos e Funções de Ordem Superior.
-
Módulos - Organize o código usando módulos.
-
Caixote - Um caixote é uma unidade de compilação na Rust. Aprenda a criar uma biblioteca.
-
Cargo - Passe por algumas funcionalidades básicas da ferramenta de gestão de pacote da Rust oficial.
-
Atributos - Um atributo é um meta-dado aplicado à algum módulo, caixote ou item.
-
Genéricos - Aprenda a escrever uma função ou tipo de dado que pode trabalhar com vários tipos de argumentos.
-
Regras de Delimitação do Âmbito - Os âmbitos desempenham um parte importante na posse, empréstimo, e vidas.
-
Características - Uma característica é uma coleção de métodos definidos por um tipo desconhecido:
Self
-
Manipulação de erro - Estude a maneira da Rust de lidar com falhas.
-
Tipos da Biblioteca
std
- Estude alguns tipos personalizados fornecidos pela bibliotecastd
. -
Outros
std
- Mais tipos personalizados para manipulação de ficheiro, linhas de processamento. -
Testes - Todos os tipos de testes na Rust.
-
Meta - Documentação, Analise Comparativa.
Olá Mundo
Este é o código-fonte do programa "Hello World" tradicional:
// Isto é um comentário, e é ignorado pelo compilador. // Podemos testar este código clicando o botão "Run" ou // se preferirmos usar o nosso teclado, podemos usar o atalho "Ctrl + Enter" // Este código é editável, sinta-se a vontade para alterá-lo! // Podemos sempre retornar para o código original clicando o botão "Reset" -> // Esta é a função principal. fn main() { // Declarações nesta função são executadas quando o binário for chamado. // Imprime o texto na consola. println!("Hello World!"); }
println!
é uma macro que imprime o texto na consola.
Um binário pode ser gerado usando o compilador da Rust: rustc
:
$ rustc hello.rs
rustc
produzirá um binário hello
que pode ser executado:
$ ./hello
Hello World!
Atividade
Clique sobre o 'Run' acima para veres a saída esperada. Depois, adicione uma nova linha com uma segunda macro println!
para que a saída exiba:
Hello World!
I'm a Rustacean!
Comentários
Qualquer programa requer comentários, e a Rust suporta algumas variedades diferentes:
- Comentários normais que são ignorados pelo compilador:
// Comentários de linha que vão até ao final da linha.
/* Comentários de bloco que vão até o delimitador de fechamento. */
- Comentários de documentação que são analisados para documentação da biblioteca em HTML:
/// Gera documentações da biblioteca para o seguinte item.
//! Gera as documentações da biblioteca para o item de fechamento.
fn main() { // Isto é um exemplo de comentário duma linha. // Existem duas barras no princípio da linha. // E nada escrito depois destas será lido pelo compilador. // println!("Hello, world!"); // Execute-o. Vês? Agora tente eliminar as duas barras, e execute-o novamente. /* * Isto é um outro tipo de comentário, um bloco de comentário. Em geral, * comentários de linha são o estilo de comentário recomendado. * Mas os comentários de bloco são extremamente úteis para * desativar temporariamente pedaços de código. * /* Comentários de bloco podem ser /* encaixados, */ */ então apenas custa * alguns toques na tecla para comentar tudo nesta função `main()`. * /*/*/* Experimente-o tu mesmo! */*/*/ */ /* Nota: a anterior coluna de `*` foi inteiramente para estilo. Não existe de fato necessidade disto. */ // Nós podemos manipular as expressões mais facilmente com comentários // de bloco do que com comentários de linha. // Tente eliminar os delimitadores de comentário para mudar o resultado: let x = 5 + /* 90 + */ 5; println!("Is `x` 10 or 100? x = {}", x); }
Consulte também:
Impressão Formatada
A impressão é manipulada por uma série de macros
definidas na std::fmt
, algumas das quais incluem:
format!
: escreve o texto formatado para aString
.print!
: o mesmo queformat!
mas o texto é imprimido para a consola (io::stdout
).println!
: o mesmo queprint!
mas uma nova linha é anexada.eprint!
: o mesmo queprint!
mas o texto é imprimido para o erro padrão (io::stderr
).eprintln!
: o mesmo queeprint!
mas uma nova linha é anexada.
Todos analisam o texto da mesma maneira. Como uma vantagem, a Rust verifica a exatidão da formatação em tempo de compilação:
fn main() { // Em geral, as `{}` serão automaticamente substituídas por quaisquer // argumentos. Estes serão convertidos em sequências de caracteres. println!("{} days", 31); // Os argumentos posicionais podem ser usados. Especificar um inteiro // dentro de `{}` determina qual argumento adicional será substituído. // Os argumentos começam em 0 imediatamente depois da // sequência de caracteres de formatação. println!("{0}, this is {1}. {1}, this is {0}", "Alice", "Bob"); // As can named arguments. println!("{subject} {verb} {object}", object="the lazy dog", subject="the quick brown fox", verb="jumps over"); // Uma formatação diferente pode ser invocada com a especificação do // carácter de formato depois dum `:`. println!("Base 10: {}", 69420); // 69420 println!("Base 2 (binary): {:b}", 69420); // 10000111100101100 println!("Base 8 (octal): {:o}", 69420); // 207454 println!("Base 16 (hexadecimal): {:x}", 69420); // 10f2c println!("Base 16 (hexadecimal): {:X}", 69420); // 10F2C // Podemos justificar o texto à direita com uma largura especificada. // Isto resultará em " 1". // (Quatro espaços em branco e um "1", para uma largura total de 5). println!("{number:>5}", number=1); // Podemos acolchoar números com zeros adicionais, println!("{number:0>5}", number=1); // 00001 // e ajustar à esquerda com a virada do sinal. Isto resultará em "10000". println!("{number:0<5}", number=1); // 10000 // Podemos usar argumentos nomeados no // especificador de formato anexando um `$`. println!("{number:0>width$}", number=1, width=5); // A Rust ainda verifica para certificar-se // que o número correto de argumentos são usados. println!("My name is {0}, {1} {0}", "Bond"); // FIXME ^ Add the missing argument: "James" // Apenas tipos que implementam `fmt::Display` podem ser formatados com `{}`. // Os tipos definidos pelo utilizador // não implementam `fmt::Display` por padrão. #[allow(dead_code)] // desativa `dead_code` que alerta sobre módulo não usado. struct Structure(i32); // Isto não compilará porque `Structure` não implementa `fmt::Display` // println!("This struct `{}` won't print...", Structure(3)); // TODO ^ Try uncommenting this line // Para a Rust 1.58 e acima, podemos capturar diretamente o argumento duma // variável envolvente. Tal como acima, isto resultará em " 1", // 4 espaços em branco e um "1". let number: f64 = 1.0; let width: usize = 5; println!("{number:>width$}"); }
O std::fmt
contém muitas traits
que governam a exibição do texto. A forma de base de dois importantes são listados abaixo:
fmt::Debug
: Usa o marcador{:?}
. Formata o texto para fins de depuração.fmt::Display
: Usa o marcador{}
. Formata o texto duma maneira mais elegante e agradável para o utilizador.
Cá, usamos fmt::Display
porque a biblioteca std
fornece implementações para estes tipos. Para imprimir texto para tipos personalizados, mais passos são necessários.
Implementar a característica fmt::Display
implementa automaticamente a característica ToString
que permite-nos converter o tipo para String
.
Na linha 46, #[allow(dead_code)]
é um atributo que apenas aplica-se ao módulo depois deste.
Atividades
- Corrija o problema no código acima (vê a FIXME) para que esta execute sem erro.
- Tente desfazer o comentário da linha que tenta formatar a estrutura
Structure
(vê TODO) - Adicione uma chamada de macro
println!
que imprime:Pi is roughly 3.142
controlando o número de casas decimais exibido. Para os fins deste exercício, uselet pi = 3.141592
como uma estimativa para o pi. (Dica: podemos precisar de consultar a documentação destd::fmt
para definir o número de decimais à exibir).
Consulte também:
std::fmt
, macros
, struct
, traits
, e dead_code
.
Depuração
Todos os tipos que quiserem usar traits
de formatação de std::fmt
exigem que uma implementação seja imprimível. Implementações automáticas são apenas fornecidas para tipos tais como na biblioteca std
. Todos os outros devem ser manualmente implementados de alguma maneira.
A trait
de fmt::Debug
torna isto muito direto. Todos os tipos podem derive
(criar automaticamente) a implementação fmt::Debug
. Isto não é verdadeiro para fmt::Display
que deve ser implementada manualmente:
#![allow(unused)] fn main() { // Esta estrutura não pode ser impressa nem com `fmt::Display` ou // com `fmt::Debug`. struct UnPrintable(i32); // O atributo `derive` cria automaticamente a implementação exigida // para tornar esta `struct` imprimível com `fmt::Debug`. #[derive(Debug)] struct DebugPrintable(i32); }
Todos os tipos da biblioteca std
são automaticamente imprimíveis também com {:?}
:
// Derivar a implementação `fmt::Debug` para `Structure`. // `Structure` é uma estrutura que contém um único `i32`. #[derive(Debug)] struct Structure(i32); // Colocar uma `Structure` dentro da estrutura `Deep`. // Também a torna imprimível. #[derive(Debug)] struct Deep(Structure); fn main() { // Imprimir com `{:?}` é semelhante à com `{}`. println!("{:?} months in a year.", 12); println!("{1:?} {0:?} is the {actor:?} name.", "Slater", "Christian", actor="actor's"); // `Structure` é imprimível! println!("Now {:?} will print!", Structure(3)); // O problema com `derive` é que não existe controlo sobre // a aparência do resultado. E se quisermos que isto só mostre um `7`? println!("Now {:?} will print!", Deep(Structure(7))); }
Assim fmt::Debug
definitivamente torna isto imprimível mas sacrifica alguma elegância. A Rust também fornece "impressão elegante" com {:#?}
:
#[derive(Debug)] struct Person<'a> { name: &'a str, age: u8 } fn main() { let name = "Peter"; let age = 27; let peter = Person { name, age }; // Imprimir com elegância println!("{:#?}", peter); }
Um pode manualmente implementar fmt::Display
para controlar a exibição.
Consulte também:
attributes
, derive
, std::fmt
,
e struct
Exibição
fmt::Debug
dificilmente parece compacto e limpo, então é frequentemente vantajoso personalizar a aparência da saída. Isto é feito implementando manualmente a fmt::Display
, que usa o marcador de impressão {}
. Sua implementação parece-se com:
#![allow(unused)] fn main() { // Importar (através de `use`) o módulo `fmt` // para torná-lo disponível. use std::fmt; // Definir uma estrutura para qual `fmt::Display` será implementado. // Isto é uma estrutura de tupla nomeada `Structure` que contém um `i32`. struct Structure(i32); // Para usar o marcador `{}`, a característica `fmt::Display` // deve ser implementada manualmente para o tipo. impl fmt::Display for Structure { // Esta característica exige `fmt` com esta exata assinatura. fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { // Escrever estritamente o primeiro elemento para a corrente // de saída fornecida: `f`. Retorna `fmt::Result` que indica se // a operação foi bem-sucedida ou falhou. Nota que `write!` usa // sintaxe que é muito semelhante ao `println!`. write!(f, "{}", self.0) } } }
fmt::Display
pode ser mais claro do que fmt::Debug
mas isto apresenta um problema para a biblioteca std
. Como deveriam os tipos ambíguos ser exibidos? Por exemplo, se a biblioteca std
implementasse um único estilo para todos os Vec<T>
, qual estilo seria? Seria qualquer destes dois?
Vec<path>
:/:/etc:/home/username:/bin
(separa-se nos:
)Vec<number>
:1,2,3
(separa-se nas,
)
Não, porque não existe estilo ideal para todos os tipos e a biblioteca std
não presume import uma. fmt::Display
não é implementada para Vec<T>
ou para quaisquer outros contentores genéricos. fmt::Debug
deve então ser usado para estes casos genéricos.
Isto não é um problema embora porque para qualquer tipo de contentor novo que não é genérico, fmt::Display
pode ser implementado:
use std::fmt; // Importar `fmt` // Uma estrutura segurando dois números. `Debug` será derivado assim os // resultados podem ser contrastados com `Display`. #[derive(Debug)] struct MinMax(i64, i64); // Implementar `Display` para `MinMax`. impl fmt::Display for MinMax { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { // Usar `self.number` para fazer referência à cada ponto de // dados posicional. write!(f, "({}, {})", self.0, self.1) } } // Definir uma estrutura onde os campos são nomeáveis para comparação. #[derive(Debug)] struct Point2D { x: f64, y: f64, } // De maneira semelhante, implementar `Display` para `Point2D`. impl fmt::Display for Point2D { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { // Personalizar, assim apenas `x` e `y` são denotados. write!(f, "x: {}, y: {}", self.x, self.y) } } fn main() { let minmax = MinMax(0, 14); println!("Compare structures:"); println!("Display: {}", minmax); println!("Debug: {:?}", minmax); let big_range = MinMax(-300, 300); let small_range = MinMax(-3, 3); println!("The big range is {big} and the small is {small}", small = small_range, big = big_range); let point = Point2D { x: 3.3, y: 7.2 }; println!("Compare points:"); println!("Display: {}", point); println!("Debug: {:?}", point); // Erro. Ambos `Debug` e `Display` foram implementados, mas `{:b}` // exige que `fmt::Binary` seja implementado. Isto não funcionará. // println!("What does Point2D look like in binary: {:b}?", point); }
Assim, fmt::Display
tem sido implementado mas fmt::Binary
não tem, e portanto não pode ser usado. std::fmt
tem vários tal como traits
e cada um exige sua própria implementação. Isto é detalhado mais adiante em std::fmt
.
Atividade
Depois de verificar a saída do exemplo acima, use a estrutura Point2D
como um guia para adicionar uma estrutura Complex
ao exemplo. Quando imprimida da mesma maneira, a saída deveria ser:
Display: 3.3 + 7.2i
Debug: Complex { real: 3.3, imag: 7.2 }
Consulte também:
derive
, std::fmt
, macros
, struct
,
trait
, e use
Caso de Teste: Lista
Implementar fmt::Display
para uma estrutura onde os elementos devem cada um ser manipulados sequencialmente é complicado. O problema é que cada write!
gera um fmt::Result
. A manipulação apropriada disto exige lidar com todos os resultados. A Rust fornece o operador ?
para exatamente este propósito.
Usar ?
na write!
parece-se com isto:
// Tentar `write!` para ver se erra. Se errar, retorne o erro.
// De outro modo continue.
write!(f, "{}", value)?;
Com ?
disponível, implementar fmt::Display
para uma Vec
é simples:
use std::fmt; // Importar o módulo `fmt`. // Definir uma estrutura nomeada `List` contendo uma `Vec`. struct List(Vec<i32>); impl fmt::Display for List { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { // Extrair o valor usando a indexação de tupla, // e criar uma referência para `vec`. let vec = &self.0; write!(f, "[")?; // Iterar sobre `v` em `vec` enquanto enumera a // contagem da iteração em `count`. for (count, v) in vec.iter().enumerate() { // Para cada elemento exceto o primeiro, adicione uma vírgula. // Usar o operador `?` para retornar em erros. if count != 0 { write!(f, ", ")?; } write!(f, "{}", v)?; } // Fechar o parênteses retos e retornar um valor de `fmt::Result` write!(f, "]") } } fn main() { let v = List(vec![1, 2, 3]); println!("{}", v); }
Atividade
Tente mudar o programa para que o índice de cada elemento no vetor seja também imprimido. A nova saída deveriam parecer-se com isto:
[0: 1, 1: 2, 2: 3]
Consulte também:
for
, ref
, Result
, struct
,
?
, e vec!
Formatação
Nós vimos que a formatação é especificada através de uma sequência de caracteres de formatação:
format!("{}", foo)
->"3735928559"
format!("0x{:X}", foo)
->"0xDEADBEEF"
format!("0o{:o}", foo)
->"0o33653337357"
A mesma variável (foo
) pode ser formatada diferentemente dependendo do qual tipo de argumento é usado: X
vs o
vs não especificado.
Esta funcionalidade de formatação é implementada através das características, e existe uma característica para cada tipo de argumento. A característica de formatação mais comum é Display
, que lida com os casos onde o tipo do argumento é deixado não especificado: {}
por exemplo.
use std::fmt::{self, Formatter, Display}; struct City { name: &'static str, // Latitude lat: f32, // Longitude lon: f32, } impl Display for City { // `f` é um amortecedor, e este método deve escrever a // sequência de caracteres de formatação para ela. fn fmt(&self, f: &mut Formatter) -> fmt::Result { let lat_c = if self.lat >= 0.0 { 'N' } else { 'S' }; let lon_c = if self.lon >= 0.0 { 'E' } else { 'W' }; // `write!` é como `format!`, mas escreverá a sequência de caracteres de // formatação para um amortecedor (o primeiro argumento). write!(f, "{}: {:.3}°{} {:.3}°{}", self.name, self.lat.abs(), lat_c, self.lon.abs(), lon_c) } } #[derive(Debug)] struct Color { red: u8, green: u8, blue: u8, } fn main() { for city in [ City { name: "Dublin", lat: 53.347778, lon: -6.259722 }, City { name: "Oslo", lat: 59.95, lon: 10.75 }, City { name: "Vancouver", lat: 49.25, lon: -123.1 }, ] { println!("{}", city); } for color in [ Color { red: 128, green: 255, blue: 90 }, Color { red: 0, green: 3, blue: 254 }, Color { red: 0, green: 0, blue: 0 }, ] { // Troque isto para usar {} uma vez que tiveres adicionado uma // implementação para `fmt::Display`. println!("{:?}", color); } }
Tu podes ver uma lista completa de características de formatação e seus tipos de argumentos na documentação da std::fmt
.
Atividade
Adicione uma implementação da característica fmt::Display
para a estrutura Color
acima para que a saída exiba algo como:
RGB (128, 255, 90) 0x80FF5A
RGB (0, 3, 254) 0x0003FE
RGB (0, 0, 0) 0x000000
Duas sugestões caso ficares sem saber o que fazer:
- Tu podes precisar de listar cada cor mais de uma vez.
- Tu podes acolchoar com zeros para uma largura de 2 com
:0>2
.
Consulte também:
Primitivos
A Rust fornece acesso à uma grande variedade de primitives
. Um amostra inclui:
Tipos Escalares
- Inteiros com sinais:
- Inteiros sem sinais:
- Ponto flutuante
- Os valores escalares de código universal de
char
como'a'
,'α'
e'∞'
(são 4 bytes cada) bool
ou étrue
ou éfalse
- O tipo de unitário
()
, cujo único valor possível é uma tupla vazia:()
Apesar do valor dum tipo unitário ser uma tupla, não é considerado um tipo composto porque não contém vários valores.
Tipos Compostos
- Arranjos como
[1, 2, 3]
- Tuplas como
(1, true)
As variáveis podem sempre ser anotadas por tipo. Os números podem adicionalmente ser anotados através dum sufixo ou por padrão. Os inteiros predefine para i32
e flutuantes para f64
. Nota que a Rust também pode inferir tipos a partir do contexto:
fn main() { // Variáveis podem ser anotadas por tipo. let logical: bool = true; let an_integer = 5i32; // Suffix annotation Anotação de sufixo // Ou um padrão será usado. let default_float = 3.0; // `f64` let default_integer = 7; // `i32` // Um tipo também pode ser inferido a partir do contexto. let mut inferred_type = 12; // O tipo i64 é inferido a partir da outra linha. inferred_type = 4294967296i64; // Um valor da variável mutável pode ser mudado. let mut mutable = 12; // `i32` mutável mutable = 21; // Erro! O tipo duma variável não pode ser mudado. mutable = true; // As variáveis podem ser sobrescritas com obscurecimento. let mutable = true; }
Consulte também:
a biblioteca std
, mut
, inference
, e
shadowing
Literais e Operadores
Os inteiros 1
, flutuantes 1.2
, caracteres 'a'
, sequências de caracteres "abc"
, booleanos true
e o tipo unitário ()
podem ser expressados usando literais.
Os inteiros podem, alternativamente, ser expressados usando a notação hexadecimal, octal ou binária usando estes prefixos respetivamente: 0x
, 0o
ou 0b
.
Os sublinhados podem ser inseridos nos literais numéricos para melhorar a legibilidade, por exemplo 1_000
é o mesmo que 1000
, e 0.000_001
é o mesmo que 0.000001
.
A Rust suporta a notação cientifica e, por exemplo 1e6
, 7.6e-4
. O tipo associado é f64
.
Nós precisamos dizer ao compilador o tipo dos literais que usamos. Por agora, usaremos o sufixo u32
para indicar que o literal é um inteiro de 32-bit sem sinal, e o sufixo i32
para indicar que é um inteiro de 32-bit com sinal.
Os operadores disponíveis e suas precedências na Rust são semelhantes às outras linguagens parecidas com C:
fn main() { // Adição de inteiro println!("1 + 2 = {}", 1u32 + 2); // Subtração de inteiro println!("1 - 2 = {}", 1i32 - 2); // TODO ^ Tente mudar `1i32` para `1u32` para veres a importância do tipo // Notação cientifica println!("1e4 is {}, -2.5e-3 is {}", 1e4, -2.5e-3); // Lógica booleana de curto-circuito println!("true AND false is {}", true && false); println!("true OR false is {}", true || false); println!("NOT true is {}", !true); // Operações bitwise println!("0011 AND 0101 is {:04b}", 0b0011u32 & 0b0101); println!("0011 OR 0101 is {:04b}", 0b0011u32 | 0b0101); println!("0011 XOR 0101 is {:04b}", 0b0011u32 ^ 0b0101); println!("1 << 5 is {}", 1u32 << 5); println!("0x80 >> 2 is 0x{:x}", 0x80u32 >> 2); // Use sublinhados para melhorar a legibilidade! println!("One million is written as {}", 1_000_000u32); }
Tuplas
Uma tupla é uma coleção de valores de tipos diferentes. As tuplas são construídas usando parênteses ()
e cada tupla por si mesma é um valor com a assinatura de tipo (T1, T2, ...)
, onde T1
, T2
são os tipos dos seus membros. As funções podem usar tuplas para retornarem vários valores, visto que as tuplas podem segurar qualquer número de valores:
// As tuplas podem ser usadas como argumentos de função e retornar valores. fn reverse(pair: (i32, bool)) -> (bool, i32) { // `let` pode ser usado para vincular os membros duma tupla às variáveis. let (int_param, bool_param) = pair; (bool_param, int_param) } // A seguinte estrutura é para a atividade. #[derive(Debug)] struct Matrix(f32, f32, f32, f32); fn main() { // Uma tupla com um grupo de tipos diferentes. let long_tuple = (1u8, 2u16, 3u32, 4u64, -1i8, -2i16, -3i32, -4i64, 0.1f32, 0.2f64, 'a', true); // Os valores podem ser extraídos da tupla usando indexação de tupla. println!("Long tuple first value: {}", long_tuple.0); println!("Long tuple second value: {}", long_tuple.1); // As tuplas podem ser membros de tupla. let tuple_of_tuples = ((1u8, 2u16, 2u32), (4u64, -1i8), -2i16); // As tuplas são imprimíveis. println!("tuple of tuples: {:?}", tuple_of_tuples); // Mas tuplas longas (mas de 12 elementos) não podem ser imprimidas. //let too_long_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13); //println!("Too long tuple: {:?}", too_long_tuple); // TODO ^ Tente desfazer o comentário das 2 linhas acima // para veres o erro do compilador let pair = (1, true); println!("Pair is {:?}", pair); println!("The reversed pair is {:?}", reverse(pair)); // Para criar tuplas de um elemento, a vírgula é obrigatória para // distingui-las dum literal envolvido por parênteses. println!("One element tuple: {:?}", (5u32,)); println!("Just an integer: {:?}", (5u32)); // As tuplas podem ser desconstruídas para criar vínculos. let tuple = (1, "hello", 4.5, true); let (a, b, c, d) = tuple; println!("{:?}, {:?}, {:?}, {:?}", a, b, c, d); let matrix = Matrix(1.1, 1.2, 2.1, 2.2); println!("{:?}", matrix); }
Atividade
-
Recapitulação: Adicione a característica
fmt::Display
à estruturaMatrix
no exemplo acima, para que se mudares da impressão do formato de depuração{:?}
para o formato de exibição{}
, consigas ver a seguinte saída:( 1.1 1.2 ) ( 2.1 2.2 )
Podemos consultar o exemplo para impressão de exibição.
-
Adicione uma função
transpose
usando a funçãoreverse
como molde, que aceita uma matriz como argumento, e retorna uma matriz na qual dos elementos têm sido trocados. Por exemplo:println!("Matrix:\n{}", matrix); println!("Transpose:\n{}", transpose(matrix));
Resulta na saída:
Matrix: ( 1.1 1.2 ) ( 2.1 2.2 ) Transpose: ( 1.1 2.1 ) ( 1.2 2.2 )
Vetores e Pedaços
Um vetor é uma coleção de objetos do mesmo tipo T
, armazenados na memória contígua. Os vetores são criados usando parênteses retos []
, e o seu comprimento, que é conhecido em tempo de compilação, é parte da sua assinatura de tipo [T; length]
.
Os pedaços são semelhantes aos vetores, mas seu comprimento não é conhecido em tempo de compilação. Ao invés disto, um pedaço é um objeto de duas palavras; a primeira palavra é um ponteiro para o dado, e a segunda palavra o comprimento do pedaço. O tamanho da palavra é o mesmo que não ter tamanho usize
, determinado pela arquitetura do processador, por exemplo, 64 bits numa x86-64. Os pedaços podem ser usados para pedir emprestado uma seção dum vetor e têm a assinatura de tipo &[T]
:
use std::mem; // Esta função pedi emprestado um pedaço. fn analyze_slice(slice: &[i32]) { println!("First element of the slice: {}", slice[0]); println!("The slice has {} elements", slice.len()); } fn main() { // Vetor de tamanho fixo (assinatura de tipo é supérflua). let xs: [i32; 5] = [1, 2, 3, 4, 5]; // Todos os elementos podem ser inicializados para o mesmo valor. let ys: [i32; 500] = [0; 500]; // A indexação começa em 0. println!("First element of the array: {}", xs[0]); println!("Second element of the array: {}", xs[1]); // `len` retorna a contagem de elementos no vetor. println!("Number of elements in array: {}", xs.len()); // Os vetores são pilhas alocadas. println!("Array occupies {} bytes", mem::size_of_val(&xs)); // Os vetores podem ser pedidos emprestado como pedaços. println!("Borrow the whole array as a slice."); analyze_slice(&xs); // Os pedaços podem apontar para uma seção dum vetor. // Eles são da forma [starting_index..ending_index]. // `starting_index` é a primeira posição no pedaço. // `ending_index` é a última posição no pedaço. println!("Borrow a section of the array as a slice."); analyze_slice(&ys[1 .. 4]); // Exemplo de pedaço vazio `&[]`: let empty_array: [u32; 0] = []; assert_eq!(&empty_array, &[]); assert_eq!(&empty_array, &[][..]); // O mesmo mas mais verboso // Os vetores podem ser acessados seguramente usando `.get`, que retorna // uma `Option`. Isto pode ser correspondido como mostrado abaixo, ou usado // com `.expect()` se gostarias que o programa saísse com uma mensagem // agradável ao invés de felizmente continuar. for i in 0..xs.len() + 1 { // Oops, um elemento longe de mais! match xs.get(i) { Some(xval) => println!("{}: {}", i, xval), None => println!("Slow down! {} is too far!", i), } } // Indexação fora do limite no vetor causa erro de tempo de compilação. //println!("{}", xs[5]); // // Indexação fora do limite no pedaço causa erro de tempo de execução. //println!("{}", xs[..][5]); }
Tipos Personalizados
Os tipos de dados personalizados da Rust são principalmente formados através de duas palavras-chave:
struct
: define uma estruturaenum
: define uma enumeração
As constantes também podem ser criadas através das palavras-chave const
e static
.
Estruturas
Existem três tipos de estruturas ("structs") que podem ser criadas usando a palavra-chave struct
:
- Estruturas de tupla, que são, basicamente, tuplas nomeadas.
- A estruturas de C clássicas,
- Estruturas unitárias, que são sem campo, são úteis para genéricos.
// Um atributo para esconder avisos de código não usado. #![allow(dead_code)] #[derive(Debug)] struct Person { name: String, age: u8, } // Uma estrutura unitária struct Unit; // Um estrutura de tupla struct Pair(i32, f32); // Uma estrutura com dois campos struct Point { x: f32, y: f32, } // As estruturas podem ser reutilizadas como campos duma outra estrutura struct Rectangle { // Um retângulo pode ser especificado pela posição dos cantos // superior esquerdo e inferior direito no espaço. top_left: Point, bottom_right: Point, } fn main() { // Criar a estrutura com a abreviatura de inicialização de campo let name = String::from("Peter"); let age = 27; let peter = Person { name, age }; // Imprimir em depuração a estrutura println!("{:?}", peter); // Instanciar um `Point` let point: Point = Point { x: 10.3, y: 0.4 }; // Acessar os campos do ponto println!("point coordinates: ({}, {})", point.x, point.y); // Criar um novo ponto usando a sintaxe de atualização de estrutura // para usar os campos do nosso outro let bottom_right = Point { x: 5.2, ..point }; // `bottom_right.y` será o mesmo que `point.y` porque usamos este campo // a partir de `point` println!("second point: ({}, {})", bottom_right.x, bottom_right.y); // Desestruturar o ponto usando uma vínculo `let` let Point { x: left_edge, y: top_edge } = point; let _rectangle = Rectangle { // A instanciação da estrutura também é uma expressão top_left: Point { x: left_edge, y: top_edge }, bottom_right: bottom_right, }; // Instanciar uma estrutura unitária let _unit = Unit; // Instanciar uma estrutura de tupla let pair = Pair(1, 0.1); // Acessar os campos duma estrutura de tupla println!("pair contains {:?} and {:?}", pair.0, pair.1); // Desestruturar uma estrutura de tupla let Pair(integer, decimal) = pair; println!("pair contains {:?} and {:?}", integer, decimal); }
Atividade
- Adicione uma função
rect_area
que calcula a área dumRectangle
(tente usar desestruturação encaixada). - Adicione uma função
square
que recebe umPoint
e umf32
como argumentos, e retorna umRectangle
com o seu canto superior esquerdo no ponto, e uma largura e altura correspondendo aof32
.
Consulte também
Enumerações
A palavra-chave enum
permite a criação dum tipo que pode ser um das poucas variantes diferentes. Qualquer que é válido como uma struct
é também válido numa enum
.
// Criar uma `enum` para classificar um evento de Web. // Nota como nomes e o tipo da informação juntos especificam a variante: // `PageLoad != PageUnload` e `KeyPress(char) != Paste(String)`. // Cada um é diferente e independente. enum WebEvent { // Uma variante `enum` pode ser qualquer`unit-like`, PageLoad, PageUnload, // como estruturas de tupla, KeyPress(char), Paste(String), // ou estruturas parecidas de C. Click { x: i64, y: i64 }, } // Uma função que recebe uma enumeração `WebEvent` como // um argumento e não retorna nada. fn inspect(event: WebEvent) { match event { WebEvent::PageLoad => println!("page loaded"), WebEvent::PageUnload => println!("page unloaded"), // Desestruturar `c` a partir duma variante `enum`. WebEvent::KeyPress(c) => println!("pressed '{}'.", c), WebEvent::Paste(s) => println!("pasted \"{}\".", s), // Desestruturar `Click` para `x` e `y`. WebEvent::Click { x, y } => { println!("clicked at x={}, y={}.", x, y); }, } } fn main() { let pressed = WebEvent::KeyPress('x'); // `to_owned()` cria uma `String` própria a partir // dum pedaço de sequência de caracteres. let pasted = WebEvent::Paste("my text".to_owned()); let click = WebEvent::Click { x: 20, y: 80 }; let load = WebEvent::PageLoad; let unload = WebEvent::PageUnload; inspect(pressed); inspect(pasted); inspect(click); inspect(load); inspect(unload); }
Pseudónimos de Tipo
Se usas um pseudónimo de tipo, podes fazer referência a cada variante de enumeração através do pseudónimo. Isto pode ser útil se o nome da enumeração for muito longo ou muito genérico, e queres renomeá-lo.
enum VeryVerboseEnumOfThingsToDoWithNumbers { Add, Subtract, } // Creates a type alias // Cria um pseudónimo de tipo type Operations = VeryVerboseEnumOfThingsToDoWithNumbers; fn main() { // Nós podemos fazer referência a cada variante através do pseudónimo, // e não seu nome longo e inconveniente. let x = Operations::Add; }
O lugar mais comum que verás isto é em blocos impl
usando o pseudónimo Self
.
enum VeryVerboseEnumOfThingsToDoWithNumbers { Add, Subtract, } impl VeryVerboseEnumOfThingsToDoWithNumbers { fn run(&self, x: i32, y: i32) -> i32 { match self { Self::Add => x + y, Self::Subtract => x - y, } } }
Para saberes mais sobre as enumerações e pseudónimos de tipo, podes ler o relatório de estabilização a partir de quando esta funcionalidade foi estabilizada para Rust.
Consulte também:
match
, fn
, e String
, RFC dos "Pseudónimos de tipo e variantes de enumeração" RFC
use
A declaração use
pode ser usado então a delimitação do âmbito não é necessária:
// Um atributo para esconder avisos de código não usado. #![allow(dead_code)] enum Status { Rich, Poor, } enum Work { Civilian, Soldier, } fn main() { // Usar explicitamente cada nome assim estão disponíveis // sem a delimitação manual. use crate::Status::{Poor, Rich}; // Usar automaticamente cada nome dentro de `Work`. use crate::Work::*; // Equivalente ao `Status::Poor`. let status = Poor; // Equivalente ao `Work::Civilian`. let work = Civilian; match status { // Nota a falta de delimitação de âmbito por causa do // uso explícito acima. Rich => println!("The rich have lots of money!"), Poor => println!("The poor have no money..."), } match work { // Nota novamente a falte de delimitação do âmbito. Civilian => println!("Civilians work!"), Soldier => println!("Soldiers fight!"), } }
Consulte também:
Parecida com C
enum
também pode ser usado como enumerações parecidas com C.
// Um atributo para esconder os avisos de código não usado. #![allow(dead_code)] // enumeração com discriminador implícito (começa em 0) enum Number { Zero, One, Two, } // enumeração com discriminador explícito enum Color { Red = 0xff0000, Green = 0x00ff00, Blue = 0x0000ff, } fn main() { // `enums` podem ser moldados como inteiros. println!("zero is {}", Number::Zero as i32); println!("one is {}", Number::One as i32); println!("roses are #{:06x}", Color::Red as i32); println!("violets are #{:06x}", Color::Blue as i32); }
Consulte também:
Caso de Teste: Lista Ligada
Um maneira comum de implementar uma lista ligada é através de enums
:
use crate::List::*; enum List { // Cons: estrutura de tipo que envolve um elemento e // um ponteiro para o próximo nó. Cons(u32, Box<List>), // Nil: Um nó que significa o fim da lista ligada Nil, } // Os métodos podem ser anexados à numa enumeração impl List { // Criar uma lista vazia fn new() -> List { // `Nil` tem o tipo `List` Nil } // Consumir uma lista, e retornar a mesma lista com // um novo elemento na sua frente fn prepend(self, elem: u32) -> List { // `Cons` também tem o tipo `List` Cons(elem, Box::new(self)) } // Retornar o comprimento da lista fn len(&self) -> u32 { // `self` precisa ser correspondido, porque o comportamento deste método // depende da variante do `self`. // `self` tem o tipo `&List`, e `*self` tem o tipo `List`, // correspondência dum tipo concreto `T` é preferida sobre // uma correspondência numa referência `&T`, depois da Rust 2018 // podes usar `self` aqui e também perseguir (sem referência) abaixo, // Rust inferirá `&s` e referenciará a cauda. // See https://doc.rust-lang.org/edition-guide/rust-2018/ownership-and-lifetimes/default-match-bindings.html match *self { // Não possível tomar posse da cauda, porque `self` é emprestado; // ao invés disto recebe uma referência à cauda Cons(_, ref tail) => 1 + tail.len(), // Caso de Base: Uma lista vaziam tem comprimento zero Nil => 0 } } // Retornar a representação da lista como uma // sequência de caracteres (monte alocado) fn stringify(&self) -> String { match *self { Cons(head, ref tail) => { // `format!` é semelhante à `print!`, mas retorna uma // sequência de caracteres de monte alocado de impressão // para a consola. format!("{}, {}", head, tail.stringify()) }, Nil => { format!("Nil") }, } } } fn main() { // Criar uma lista ligada vazia let mut list = List::new(); // Adicionar ao começo da lista alguns elementos list = list.prepend(1); list = list.prepend(2); list = list.prepend(3); // Mostrar o estado final da lista println!("linked list has length: {}", list.len()); println!("{}", list.stringify()); }
Consulte também:
Constantes
A Rust tem dois tipos diferentes de constantes que podem ser declarados em qualquer âmbito incluindo global. Ambos exigem anotação de tipo explícita:
const
: Um valor não modificável (o caso comum).static
: Uma variável possivelmente mutável com a vida útil de'static
.- A vida útil estática é inferida e não precisa de ser especificada.
- O acesso ou modificação duma variável estática mutável é
unsafe
.
// As globais são declaradas fora de todos os outros âmbitos. static LANGUAGE: &str = "Rust"; const THRESHOLD: i32 = 10; fn is_big(n: i32) -> bool { // Acessar constante em alguma função n > THRESHOLD } fn main() { let n = 16; // Acessar constante na linha principal println!("This is {}", LANGUAGE); println!("The threshold is {}", THRESHOLD); println!("{} is {}", n, if is_big(n) { "big" } else { "small" }); // Erro! Não possível modificar uma `const`. THRESHOLD = 5; // FIXME ^ Comente esta linha }
Consulte também:
O RFC de const
/static
, vida útil da 'static
.
Vínculos de Variável
A Rust fornece segurança de tipo através da definição de tipos estáticos. Os vínculos de variável pode ter o seu tipo anotado quando declarados. No entanto, na maioria dos casos, o compilador será capaz de inferir o tipo da variável a partir do contexto, reduzindo grandemente o fardo da anotação.
Os valores (tais como literais) podem ser vinculados às variáveis, usando o vínculo de let
.
fn main() { let an_integer = 1u32; let a_boolean = true; let unit = (); // copiar `an_integer` para `copied_integer` let copied_integer = an_integer; println!("An integer: {:?}", copied_integer); println!("A boolean: {:?}", a_boolean); println!("Meet the unit value: {:?}", unit); // O compilador avisa sobre vínculos de variáveis não usados; // estes avisos podem ser silenciados prefixando o nome da variável // com um sublinhado. let _unused_variable = 3u32; let noisy_unused_variable = 2u32; // FIXME ^ Prefixe com um sublinhado para suprimir o aviso // Nota que os avisos podem não ser exibidos num navegador
Mutabilidade
Os vínculos de variável são imutáveis por padrão, mas isto pode ser sobreposto usando o modificador mut
.
fn main() { let _immutable_binding = 1; let mut mutable_binding = 1; println!("Before mutation: {}", mutable_binding); // Ok mutable_binding += 1; println!("After mutation: {}", mutable_binding); // Erro! Não é possível atribuir um novo valor à uma variável imutável _immutable_binding += 1; }
O compilador lançará um diagnóstico detalhado sobre os erros de mutabilidade.
Âmbito e Obscurecimento
Os vínculos de variável têm um âmbito, e são obrigados a viver num bloco. Um bloco é uma coleção de declarações fechadas por chavetas {}
.
fn main() { // Este vínculo vive na função principal let long_lived_binding = 1; // Isto é um bloco, e tem um âmbito mais pequeno do que a função principal { // Este vínculo apenas existe neste bloco let short_lived_binding = 2; println!("inner short: {}", short_lived_binding); } // Fim do bloco // Erro! `short_lived_binding` não existe neste âmbito println!("outer short: {}", short_lived_binding); // FIXME ^ Comente esta linha println!("outer long: {}", long_lived_binding); }
Além disto, obscurecimento de variável é permitido.
fn main() { let shadowed_binding = 1; { println!("before being shadowed: {}", shadowed_binding); // Este vínculo *obscurece* aquela que está no exterior let shadowed_binding = "abc"; println!("shadowed in inner block: {}", shadowed_binding); } println!("outside inner block: {}", shadowed_binding); // Este vínculo *obscurece* o vínculo anterior let shadowed_binding = 2; println!("shadowed in outer block: {}", shadowed_binding); }
Declarar primeiro
É possível declarar vínculos de variável primeiro, e inicializa-los depois. No entanto, esta forma é raramente usada, visto que pode conduzir à uso de variáveis não inicializadas.
fn main() { // Declarar um vínculo de variável let a_binding; { let x = 2; // Inicializar o vínculo a_binding = x * x; } println!("a binding: {}", a_binding); let another_binding; // Erro! Uso de vínculo não inicializado println!("another binding: {}", another_binding); // FIXME ^ Comenta esta linha another_binding = 1; println!("another binding: {}", another_binding); }
O compilador proíbe o uso de variáveis não inicializadas, visto que isto conduziria à comportamento não definido.
Congelação
Quando o dado estiver vinculado ao mesmo nome imutavelmente, ele também congela. O dado congelado não pode ser modificado até o vínculo imutável sair do âmbito:
fn main() { let mut _mutable_integer = 7i32; { // Obscurecimento pela `_mutable_integer` imutável let _mutable_integer = _mutable_integer; // Erro! `_mutable_integer` está congelado neste âmbito _mutable_integer = 50; // FIXME ^ Comente esta linha // `_mutable_integer` sai fora do âmbito } // Ok! `_mutable_integer` não está congelado neste âmbito _mutable_integer = 3; }
Tipos
A Rust fornece vários mecanismos para mudar ou definir o tipo de tipos primitivos e definidos pelo utilizador. As seguintes seções cobrem:
- Moldagem entre tipos primitivos
- Especificação do tipo de literal desejado
- Uso de inferência de tipo
- Definição de pseudónimos de tipos
Moldagem
A Rust não fornece nenhuma conversão (coerção) de tipo implícita entre tipos primitivos. Mas, a conversão (moldagem) de tipo explícita pode ser realizada usando a palavra-chave as
.
As regras para converter entre tipos integrais segue geralmente as convenções da C, exceto em casos onde a C tem comportamento não definido. O comportamento de todos os moldes entre tipos integrais é bem definido na Rust.
// Suprimir todos os avisos dos moldes que transbordam. #![allow(overflowing_literals)] fn main() { let decimal = 65.4321_f32; // Error! Sem conversão implícita let integer: u8 = decimal; // FIXME ^ Comente esta linha // Conversão explícita let integer = decimal as u8; let character = integer as char; // Error! Existem limitações nas regras de conversão. // Um flutuante não pode ser diretamente convertido à um carácter. let character = decimal as char; // FIXME ^ Comente esta linha println!("Casting: {} -> {} -> {}", decimal, integer, character); // quando moldamos qualquer valor à um tipo sem sinal, T, // T::MAX + 1 é adicionado ou subtraído até o valor caber num novo tipo // 1000 já cabe num u16 println!("1000 as a u16 is: {}", 1000 as u16); // 1000 - 256 - 256 - 256 = 232 // Nos bastidores, o primeiro os bits menos significativos de 8 são mantidos, // enquanto o resto perto do bit mais significativo é truncado. println!("1000 as a u8 is : {}", 1000 as u8); // -1 + 256 = 255 println!(" -1 as a u8 is : {}", (-1i8) as u8); // Para os números positivos, isto é o mesmo que modulus println!("1000 mod 256 is : {}", 1000 % 256); // Quando moldamos à um tipo com sinal, o resultado (bitwise) é o mesmo que // a primeira moldagem ao tipo sem sinal correspondente. // Se o bit mais significativo deste valor for 1, então o valor é negativo. // A menos que já caiba, claro. println!(" 128 as a i16 is: {}", 128 as i16); // 128 as u8 -> 128, // cujo valor na representação de complemento dos dois de 8-bit é: println!(" 128 as a i8 is : {}", 128 as i8); // repetindo o exemplo acima // 1000 as u8 -> 232 println!("1000 as a u8 is : {}", 1000 as u8); // e o valor de 232 na representação de complemento dos dois de 8-bit é -24 println!(" 232 as a i8 is : {}", 232 as i8); // Desde a Rust 1.45, a palavra-chave `as` realiza um *molde de saturação* // quando moldamos de flutuante à inteiro. Se o valor de ponto flutuante // excede o saldo máximo ou é menos do que o saldo mínimo, o valor // retornado será igual ao salto cruzado. // 300.0 as u8 é 255 println!(" 300.0 as u8 is : {}", 300.0_f32 as u8); // -100.0 as u8 é 0 println!("-100.0 as u8 is : {}", -100.0_f32 as u8); // nan as u8 é 0 println!(" nan as u8 is : {}", f32::NAN as u8); // Este comportamento incorre num pequeno custo de execução e // pode ser evitado com métodos inseguros, no entanto o resultado pode // transbordar e retornar **valores incertos**. Use estes métodos sabiamente: unsafe { // 300.0 as u8 é 44 println!(" 300.0 as u8 is : {}", 300.0_f32.to_int_unchecked::<u8>()); // -100.0 as u8 é 156 println!("-100.0 as u8 is : {}", (-100.0_f32).to_int_unchecked::<u8>()); // nan as u8 é 0 println!(" nan as u8 is : {}", f32::NAN.to_int_unchecked::<u8>()); } }
Literais
Os literais numéricos podem ser anotados por tipo adicionado o tipo como um sufixo. Como um exemplo, para especificar que o literal 42
deveria ter o tipo i32
, escreva 42i32
.
O tipo dos literais numéricos sem sufixo depende de como são usados. Se não existir nenhuma restrição, o compilador usará i32
para os inteiros, e f64
para os números de ponto flutuante.
fn main() { // Literais com sufixo, seus tipos são conhecidos na inicialização let x = 1u8; let y = 2u32; let z = 3f32; // Literais sem sufixo, seus tipos dependem de como são usados let i = 1; let f = 1.0; // `size_of_val` retorna o tamanho duma variável em bytes println!("size of `x` in bytes: {}", std::mem::size_of_val(&x)); println!("size of `y` in bytes: {}", std::mem::size_of_val(&y)); println!("size of `z` in bytes: {}", std::mem::size_of_val(&z)); println!("size of `i` in bytes: {}", std::mem::size_of_val(&i)); println!("size of `f` in bytes: {}", std::mem::size_of_val(&f)); }
Existem alguns conceitos usados no código anterior que ainda não foram explicados, cá está uma breve explicação para os leitores impacientes:
std::mem::size_of_val
é uma função, mas chamada com o seu caminho completo. O código pode ser separado em unidades lógicas chamadas de módulos. Neste caso, a funçãosize_of_val
é definida no módulomem
, e o módulomem
é definida no caixotestd
. Para mais detalhes, consulte os módulos e caixotes.
Inferência
O motor da inferência de tipo é muito inteligente. Ele faz mais do que olhar o tipo da expressão do valor durante uma inicialização. Ele também olha em como a variável é usado mais tarde para inferir o seu tipo. Cá está um exemplo avançado da inferência de tipo:
fn main() { // Por causa da anotação, o compilador sabe que `elem` tem o tipo u8. let elem = 5u8; // Criar um vetor vazio (uma tabela que cresce). let mut vec = Vec::new(); // Neste momento o compilador não sabe o tipo exato de `vec`, // apenas sabe que é um vetor de alguma coisa (`Vec<_>`). // Inserir `elem` no vetor. vec.push(elem); // Aha! Agora o compilador sabe que `vec` é um vetor de `u8` (`Vec<u8>`) // TODO ^ Tente comentar a linha `vec.push(elem)` println!("{:?}", vec); }
Nenhuma anotação de tipo de variáveis foi necessária, o compilador está feliz e consequentemente o programador!
Pseudónimos
A declaração type
pode ser usada para dar um novo nome à um tipo existente. Os tipos devem ter nomes com o seguinte padrão UpperCamelCase
, ou o compilador levantará um aviso. As exceções à esta regra são os tipos primitivos: usize
, f32
, etc.
// `NanoSecond`, `Inch`, e `U64` são novos nomes para `u64`. type NanoSecond = u64; type Inch = u64; type U64 = u64; fn main() { // `NanoSecond` = `Inch` = `U64` = `u64`. let nanoseconds: NanoSecond = 5 as U64; let inches: Inch = 2 as U64; // Nota que os pseudónimos de tipo *não* fornecem qualquer segurança de // tipo adicional, porque os pseudónimos *não* são novos tipos println!("{} nanoseconds + {} inches = {} unit?", nanoseconds, inches, nanoseconds + inches); }
O uso principal de pseudónimos é reduzir a complexidade; por exemplo o tipo io::Result<T>
é um pseudónimo para o tipo Result<T, io::Error>
.
Consulte também:
Conversão
Os tipos primitivos podem ser convertidos à um ao outro através da moldagem.
A Rust aborda a conversão entre tipos personalizados (por exemplo, struct
e enum
) com o uso de características. As conversões genéricas usarão as características From
e Into
. No entanto existem aquelas mais específicas para casos mais comuns, em especial quando convertemos para e de String
.
From
e Into
As características From
e Into
estão inerentemente ligadas, e isto é efetivamente parte da sua implementação. Se fores capaz de converter o tipo A a partir do tipo B, então deveria ser fácil acreditar que deveríamos ser capaz de converter o tipo B para o tipo A.
From
A característica From
permite um tipo definir como criar-se a partir doutro tipo, daí fornecendo um mecanismo muito simples para converter entre vários tipos. Existem numerosas implementações desta característica dentro da biblioteca padrão para conversão de tipos primitivos e comuns.
Por exemplo podemos facilmente converter uma str
numa String
:
#![allow(unused)] fn main() { let my_str = "hello"; let my_string = String::from(my_str); }
Nós podemos fazer algo semelhante para definir uma conversão para o nosso próprio tipo.
use std::convert::From; #[derive(Debug)] struct Number { value: i32, } impl From<i32> for Number { fn from(item: i32) -> Self { Number { value: item } } } fn main() { let num = Number::from(30); println!("My number is {:?}", num); }
Into
A característica Into
é simplesmente o recíproco da característica From
. Isto é, se tiveres implementado a característica From
para o teu tipo, Into
o chamará quando necessário.
O uso da característica Into
normalmente exigirá especificação do tipo para converter conforme o compilador for incapaz de determinar isto na maioria das vezes. No entanto, isto é um pequeno compromisso considerando que recebemos a funcionalidade gratuitamente:
use std::convert::From; #[derive(Debug)] struct Number { value: i32, } impl From<i32> for Number { fn from(item: i32) -> Self { Number { value: item } } } fn main() { let int = 5; // Tente remover a anotação de tipo let num: Number = int.into(); println!("My number is {:?}", num); }
TryFrom
e TryInto
Semelhante ao From
e Into
, TryFrom
e TryInto
são características genéricas para converter entre tipos. Ao contrário de From
e Into
, as características TryFrom
e TryInto
são usadas para conversões falíveis, e como tal, retornam Result
.
use std::convert::TryFrom; use std::convert::TryInto; #[derive(Debug, PartialEq)] struct EvenNumber(i32); impl TryFrom<i32> for EvenNumber { type Error = (); fn try_from(value: i32) -> Result<Self, Self::Error> { if value % 2 == 0 { Ok(EvenNumber(value)) } else { Err(()) } } } fn main() { // TryFrom assert_eq!(EvenNumber::try_from(8), Ok(EvenNumber(8))); assert_eq!(EvenNumber::try_from(5), Err(())); // TryInto let result: Result<EvenNumber, ()> = 8i32.try_into(); assert_eq!(result, Ok(EvenNumber(8))); let result: Result<EvenNumber, ()> = 5i32.try_into(); assert_eq!(result, Err(())); }
Para e a partir de Sequências de Caracteres
Converter para Sequência de Caracteres
Converter qualquer tipo para uma String
é tão simples quando implementar a característica ToString
para o tipo. Ao invés de fazer isto diretamente, deves implementar a característica fmt::Display
que auto-magicamente fornece ToString
e também permite imprimir o tipo conforme discutidos na seção print!
.
use std::fmt; struct Circle { radius: i32 } impl fmt::Display for Circle { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "Circle of radius {}", self.radius) } } fn main() { let circle = Circle { radius: 6 }; println!("{}", circle.to_string()); }
Analisando uma Sequência de Caracteres
Um dos tipos mais comum a converter para uma sequência de caracteres é um número. A abordagem idiomática para isto é usar a função parse
e ou organizar para a inferência de tipo ou especificar o tipo a analisar usando a sintaxe 'turbofish'. Ambas alternativas são mostradas no seguinte exemplo.
Isto converterá a sequência de caracteres para o tipo especificado enquanto a característica FromStr
é implementada para este tipo. Isto é implementado para numerosos tipos dentro da biblioteca padrão. Para obter esta funcionalidade num tipo definido pelo utilizador simplesmente implemente a característica FromStr
para este tipo.
fn main() { let parsed: i32 = "5".parse().unwrap(); let turbo_parsed = "10".parse::<i32>().unwrap(); let sum = parsed + turbo_parsed; println!("Sum: {:?}", sum); }
Expressões
Um programa de Rust é (maioritariamente) composto duma série de declarações e expressões:
fn main() { // declaração // declaração // declaração }
Existem alguns tipos de declarações na Rust. Os dois mais comuns são declaração dum vínculo de variável, e o uso dum ;
com expressão:
fn main() { // vínculo de variável let x = 5; // expressão; x; x + 1; 15; }
Os blocos são também expressões, então podem ser usados como valores em atribuições. A última expressão no bloco serão atribuídos à expressão do local tal como uma variável local. No entanto, se a última expressão do bloco termina com um sinal de ponto e vírgula, o valor de retorno será ()
:
fn main() { let x = 5u32; let y = { let x_squared = x * x; let x_cube = x_squared * x; // Esta expressão será atribuída à `y` x_cube + x_squared + x }; let z = { // O ponto e vírgula suprime esta expressão e `()` é atribuído à `z` 2 * x; }; println!("x is {:?}", x); println!("y is {:?}", y); println!("z is {:?}", z); }
Fluxo de Controlo
Uma parte integral de qualquer linguagem de programação são as maneiras de modificar o fluxo de controlo: if
ou else
, for
e outros. Falaremos sobre os mesmos na Rust.
if
ou else
A ramificação com if
ou else
é semelhante às outras linguagens. Ao contrário delas, a condição booleana não precisa ser envolvida por parênteses, e cada condição é seguida por um bloco. As condicionais if
e else
são expressões, e todos os ramos deve retornar o mesmo tipo.
fn main() { let n = 5; if n < 0 { print!("{} is negative", n); } else if n > 0 { print!("{} is positive", n); } else { print!("{} is zero", n); } let big_n = if n < 10 && n > -10 { println!(", and is a small number, increase ten-fold"); // Esta expressão retorna uma `i32`. 10 * n } else { println!(", and is a big number, halve the number"); // Esta expressão deve retornar também uma `i32`. n / 2 // TODO ^ Tente suprimir esta expressão com um ponto e vírgula. }; // ^ Não esqueça do ponto e vírgula! Todos vínculos de `let` precisam. println!("{} -> {}", n, big_n); }
Laço de Repetição
A Rust fornece uma palavra-chave loop
para indicar um laço infinito.
A declaração break
pode ser usada para sair dum laço de repetição a qualquer momento, ao passo que a declaração continue
pode ser usada para ignorar o resto da iteração e começar uma nova.
fn main() { let mut count = 0u32; println!("Let's count until infinity!"); // Laço de repetição infinita loop { count += 1; if count == 3 { println!("three"); // Ignorar o resto desta iteração continue; } println!("{}", count); if count == 5 { println!("OK, that's enough"); // Sair deste laço de repetição break; } } }
Encaixamentos e Rótulos
É possível rebentar (break
) ou ignorar (continue
) laços de repetição externos quando lidamos com laços de repetição encaixados. Nestes casos, os laços devem ser anotados com o mesmo rótulo ('label
), e o rótulo deve ser passado à declaração break
ou continue
.
#![allow(unreachable_code, unused_labels)] fn main() { 'outer: loop { println!("Entered the outer loop"); 'inner: loop { println!("Entered the inner loop"); // Isto apenas rebentaria o laço de repetição interno //break; // Isto rebenta o laço de repetição externo break 'outer; } println!("This point will never be reached"); } println!("Exited the outer loop"); }
Regressar dos Laços de Repetições
Um dos usos dum loop
é para voltar a tentar uma operação até ser bem-sucedida. Mas se a operação retornar um valor, podes precisar de passá-lo ao resto do código: colocá-lo depois da break
, e será retornado pela expressão loop
.
fn main() { let mut counter = 0; let result = loop { counter += 1; if counter == 10 { break counter * 2; } }; assert_eq!(result, 20); }
while
A palavra-chave while
pode ser usada para executar um laço de repetição enquanto (while
) uma condição for verdadeira.
Escreveremos o infame FizzBuzz usando um laço de repetição while
.
fn main() { // Uma variável contadora let mut n = 1; // Laço de repetição enquanto `n` for menor do que 101 while n < 101 { if n % 15 == 0 { println!("fizzbuzz"); } else if n % 3 == 0 { println!("fizz"); } else if n % 5 == 0 { println!("buzz"); } else { println!("{}", n); } // Incrementar contador n += 1; } }
Laços de Repetição for
for
e o limite
A construção for in
pode ser usada para iterar através dum Iterator
. Uma das maneiras mais fáceis de criar um iterador é usar a notação de limite a..b
. Isto produz valores a partir de a
(inclusivo) à b
(exclusivo) em passos de um.
Vamos escrever FizzBuzz usando for
ao invés de while
.
fn main() { // `n` receberá os valores: 1, 2, ..., 100 em cada iteração for n in 1..101 { if n % 15 == 0 { println!("fizzbuzz"); } else if n % 3 == 0 { println!("fizz"); } else if n % 5 == 0 { println!("buzz"); } else { println!("{}", n); } } }
Alternativamente, a..=b
pode ser usado para um limite que é inclusivo em ambos fins. O exemplo acima pode ser escrito como:
fn main() { // `n` receberá os valores: 1, 2, ..., 100 em cada iteração for n in 1..=100 { if n % 15 == 0 { println!("fizzbuzz"); } else if n % 3 == 0 { println!("fizz"); } else if n % 5 == 0 { println!("buzz"); } else { println!("{}", n); } } }
for
e os iteradores
A construção for in
é capaz de interagir com um Iterator
de muitas maneiras. Conforme discutido na seção sobre a característica Iterator
, por padrão o laço for
aplicará a função into_iter
à coleção. No entanto, isto não é o único significado de converter coleções em iteradores.
into_iter
, iter
e iter_mut
manipulam a conversão duma coleção num iterador de maneiras diferentes, fornecendo visões diferentes sobre os dados no interior.
iter
- Este pedi emprestado cada elemento da coleção durante cada iteração. Assim deixando a coleção intocada e disponível para reuso depois do laço.
fn main() { let names = vec!["Bob", "Frank", "Ferris"]; for name in names.iter() { match name { &"Ferris" => println!("There is a rustacean among us!"), // TODO ^ Tente eliminar o & e corresponder apenas "Ferris" _ => println!("Hello {}", name), } } println!("names: {:?}", names); }
into_iter
- Este consume a coleção para que em cada iteração o dado exato ser fornecido. Assim que a coleção tiver sido consumida não estará disponível para reuso como se tivesse sido 'movida' dentro do laço.
fn main() { let names = vec!["Bob", "Frank", "Ferris"]; for name in names.into_iter() { match name { "Ferris" => println!("There is a rustacean among us!"), _ => println!("Hello {}", name), } } println!("names: {:?}", names); // FIXME ^ Comente esta linha }
iter_mut
- Este pedi emprestado de maneira mutável cada elemento da coleção, permitindo a coleção ser modificada no lugar.
fn main() { let mut names = vec!["Bob", "Frank", "Ferris"]; for name in names.iter_mut() { *name = match name { &mut "Ferris" => "There is a rustacean among us!", _ => "Hello", } } println!("names: {:?}", names); }
Nos trechos acima nota o tipo do ramo match
, que é a diferença chave nos tipos de iteração. A diferença no tipo portanto com certeza implica que diferentes ações que são capazes de ser realizadas.
Consulte também:
match
A Rust fornece correspondência de padrão através da palavra-chave match
, que pode ser usada como uma switch
de C. O primeiro braço de correspondência é avaliado e todos os valores possíveis devem ser cobertos.
fn main() { let number = 13; // TODO ^ Tente valores diferentes para `number` println!("Tell me about {}", number); match number { // Corresponder um único valor 1 => println!("One!"), // Corresponder vários valores 2 | 3 | 5 | 7 | 11 => println!("This is a prime"), // TODO ^ Tente adicionar 13 à lista de valores primos // Corresponder um limite inclusivo 13..=19 => println!("A teen"), // Lidar com o resto dos casos _ => println!("Ain't special"), // TODO ^ Tente comentar este braço de captura total } let boolean = true; // `match` também é uma expressão let binary = match boolean { // Os braços duma correspondência deve cobrir todos os valores possíveis false => 0, true => 1, // TODO ^ Tente comentar um destes braços }; println!("{} -> {}", boolean, binary); }
Desestruturação
Um bloco match
pode desestruturar itens numa variedade de maneiras.
- Desestruturação de Tuplas
- Desestruturação de Vetores e Pedaços
- Desestruturação de Enumerações
- Desestruturação de Ponteiros
- Desestruturação de Estruturas
Tuplas
As tuplas podem ser desestruturas num match
da seguinte maneira:
fn main() { let triple = (0, -2, 3); // TODO ^ Experimente valores diferentes para `triple` println!("Tell me about {:?}", triple); // A correspondência pode ser usada para desestruturar uma tupla match triple { // Desestruturar o segundo e o terceiro elemento (0, y, z) => println!("First is `0`, `y` is {:?}, and `z` is {:?}", y, z), (1, ..) => println!("First is `1` and the rest doesn't matter"), (.., 2) => println!("last is `2` and the rest doesn't matter"), (3, .., 4) => println!("First is `3`, last is `4`, and the rest doesn't matter"), // `..` pode ser usado para ignorar o resto da tupla _ => println!("It doesn't matter what they are"), // `_` significa não vincular o valor à uma variável } }
Consulte também:
Vetores ou Pedaços
Tal como as tuplas, os vetores e pedaços podem ser desestruturados desta maneira:
fn main() { // Experimente mudar os valores no vetor, ou torne-o num pedaço! let array = [1, -2, 6]; match array { // Vincula o segundo e o terceiro elemento às respetivas variáveis [0, second, third] => println!("array[0] = 0, array[1] = {}, array[2] = {}", second, third), // Os valores individuais podem ser ignorados com _ [1, _, third] => println!( "array[0] = 1, array[2] = {} and array[1] was ignored", third ), // Nós também podemos vincular alguns e ignorar o resto [-1, second, ..] => println!( "array[0] = -1, array[1] = {} and all the other ones were ignored", second ), // O código abaixo não compilaria // [-1, second] => ... // Ou armazena-os noutro vetor ou pedaço (o tipo depende do // valor que está sendo comparado) [3, second, tail @ ..] => println!( "array[0] = 3, array[1] = {} and the other elements were {:?}", second, tail ), // Combinado estes padrões, podemos, por exemplo, vincular os // primeiros e últimos valores, e armazenar o resto deles num único vetor [first, middle @ .., last] => println!( "array[0] = {}, middle = {:?}, array[2] = {}", first, middle, last ), } }
Consulte também:
Vetores e Pedaços e Vínculo pelo símbolo @
Enumerações
Uma enum
é desestruturada de maneira semelhante:
// `allow` exigido para silenciar os avisos // porque apenas uma variante é usada. #[allow(dead_code)] enum Color { // Estes 3 são especificados unicamente pelo seu nome. Red, Blue, Green, // Estes igualmente atam as tuplas de `u32` para // nomes diferentes: modelos de cor. RGB(u32, u32, u32), HSV(u32, u32, u32), HSL(u32, u32, u32), CMY(u32, u32, u32), CMYK(u32, u32, u32, u32), } fn main() { let color = Color::RGB(122, 17, 40); // TODO ^ Experimente diferentes variantes para `color` println!("What color is it?"); // Uma `enum` pode ser desestruturada usando uma `match`. match color { Color::Red => println!("The color is Red!"), Color::Blue => println!("The color is Blue!"), Color::Green => println!("The color is Green!"), Color::RGB(r, g, b) => println!("Red: {}, green: {}, and blue: {}!", r, g, b), Color::HSV(h, s, v) => println!("Hue: {}, saturation: {}, value: {}!", h, s, v), Color::HSL(h, s, l) => println!("Hue: {}, saturation: {}, lightness: {}!", h, s, l), Color::CMY(c, m, y) => println!("Cyan: {}, magenta: {}, yellow: {}!", c, m, y), Color::CMYK(c, m, y, k) => println!("Cyan: {}, magenta: {}, yellow: {}, key (black): {}!", c, m, y, k), // Não precisa doutro braço porque todas variantes foram examinadas } }
Consulte também:
#[allow(...)]
, modelos de cor e enum
Ponteiros ou Referência
Para os ponteiros, uma distinção precisa de ser feita entre a desestruturação e a destruição de referência, uma vez que são conceitos diferentes que são usados de maneira diferente nas linguagens como C/C++.
- A destruição de referência usa
*
- A desestruturação usa
&
,ref
, eref mut
fn main() { // Atribuir uma referência do tipo `i32`. O `&` significa que // existe uma referência a ser atribuída. let reference = &4; match reference { // Se `reference` for o padrão correspondente contra `&val`, // resulta numa comparação como: // `&i32` // `&val` // ^ Nós vemos que se os `&` correspondentes forem eliminados, // então deve ser atribuído à `val`. &val => println!("Got a value via destructuring: {:?}", val), } // Para evitar o `&`, destruímos a referência antes da correspondência. match *reference { val => println!("Got a value via dereferencing: {:?}", val), } // E se não começarmos com uma referência? `reference` era um `&` // porque o lado direito já era uma referência. Isto não é uma // referência porque o lado direito não é uma. let _not_a_reference = 3; // A Rust fornece `ref` para exatamente este propósito. Ela modifica a // atribuição para que uma referência seja criada para o elemento; // esta referência é atribuída. let ref _is_a_reference = 3; // Assim, ao definir 2 valores sem referência, as referências podem // ser recuperadas através de `ref` e `ref mut`. let value = 5; let mut mut_value = 6; // Usar a palavra-chave `ref` para criar uma referência match value { ref r => println!("Got a reference to a value: {:?}", r), } // Usar `ref mut` de maneira semelhante. match mut_value { ref mut m => { // Temos uma referência. Temos que destruir a sua referência // antes que possamos adicionar qualquer coisa à ela. *m += 10; println!("We added 10. `mut_value`: {:?}", m); }, } }
Consulte também:
Estruturas
De maneira semelhante, uma struct
pode ser desestruturada como mostrada:
fn main() { struct Foo { x: (u32, u32), y: u32, } // Experimente mudar os valores na estrutura para veres o que acontece let foo = Foo { x: (1, 2), y: 3 }; match foo { Foo { x: (1, b), y } => println!("First of x is 1, b = {}, y = {} ", b, y), // nós podemos desestruturar as estruturas e renomear as variáveis, // a ordem não é importante Foo { y: 2, x: i } => println!("y is 2, i = {:?}", i), // e também podemos ignorar algumas variáveis: Foo { y, .. } => println!("y = {}, we don't care about x", y), // isto dará um erro: o padrão não menciona o campo `x` //Foo { y } => println!("y = {}", y), } let faa = Foo { x: (1, 2), y: 3 }; // Nós não precisamos dum bloco de correspondência para // desestruturar as estruturas: let Foo { x : x0, y: y0 } = faa; println!("Outside: x0 = {x0:?}, y0 = {y0}"); }
Consulte também:
Guardas
Uma guarda de match
pode ser adicionada para filtrar o braço.
#[allow(dead_code)] enum Temperature { Celsius(i32), Fahrenheit(i32), } fn main() { let temperature = Temperature::Celsius(35); // ^ TODO experimento diferentes valores para `temperature` match temperature { Temperature::Celsius(t) if t > 30 => println!("{}C is above 30 Celsius", t), // A parte da condição if ^ é uma guarda Temperature::Celsius(t) => println!("{}C is below 30 Celsius", t), Temperature::Fahrenheit(t) if t > 86 => println!("{}F is above 86 Fahrenheit", t), Temperature::Fahrenheit(t) => println!("{}F is below 86 Fahrenheit", t), } }
Nota que o compilador não levará as condições da guarda em consideração quando verificamos se todos os padrões estão cobertos pela expressão de correspondência.
fn main() { let number: u8 = 4; match number { i if i == 0 => println!("Zero"), i if i > 0 => println!("Greater than zero"), // _ => unreachable!("Should never happen."), // TODO ^ remova o comentário para corrigir a compilação } }
Consulte Também:
Vínculo
Acessar indiretamente uma variável torna impossível ramificar e usar esta variável sem re-vincular. match
fornece o símbolo @
para vincular os valores aos nomes:
// Uma função `age` que retorna uma `u32`. fn age() -> u32 { 15 } fn main() { println!("Tell me what type of person you are"); match age() { 0 => println!("I haven't celebrated my first birthday yet"), // Poderia `match` 1 ..= 12 diretamente mas então de qual // idade seria a criança? Ao invés disto, vinculamos ao `n` // para a sequência de 1 ..= 12. Agora a idade pode ser reportada. n @ 1 ..= 12 => println!("I'm a child of age {:?}", n), n @ 13 ..= 19 => println!("I'm a teen of age {:?}", n), // Nada vinculado. Retorna o resultado. n => println!("I'm an old person of age {:?}", n), } }
You can also use binding to "destructure" enum
variants, such as Option
:
Nós também podemos usar o vínculo para "desestruturar" as variantes de enum
, tais como Option
:
fn some_number() -> Option<u32> { Some(42) } fn main() { match some_number() { // Recebeu a variante `Some`, corresponde se seu valor, // vinculado à `n`, é igual à 42. Some(n @ 42) => println!("The Answer: {}!", n), // Corresponde qualquer outro número. Some(n) => println!("Not interesting... {}", n), // Corresponde mais alguma coisa (variante `None`). _ => (), } }
Consulte Também:
if let
Para alguns casos de uso, quando correspondemos enumerações, match
é difícil. Por exemplo:
#![allow(unused)] fn main() { // Tornar `optional` do tipo `Option<i32>` let optional = Some(7); match optional { Some(i) => { println!("This is a really long string and `{:?}`", i); // ^ Precisava de 2 indentações só assim poderíamos // desestruturar `i` a partir da opção. }, _ => {}, // ^ Obrigatório porque `match` é exaustiva. // Não parece como espaço desperdiçado? }; }
if let
é mais clara para este caso de uso e além disso permite várias opções de fracasso a serem especificadas:
fn main() { // Todos têm o tipo `Option<i32>` let number = Some(7); let letter: Option<i32> = None; let emoticon: Option<i32> = None; // A construção `if let` lê-se: "se `let` desestruturar `number` na" // `Some(i)`, avaliar o bloco (`{}`). if let Some(i) = number { println!("Matched {:?}!", i); } // Se precisarmos de especificar uma falha, usamos uma `else`: if let Some(i) = letter { println!("Matched {:?}!", i); } else { // Desestruturação falhada. Mudar para o caso de falha. println!("Didn't match a number. Let's go with a letter!"); } // Fornecer uma condição de falha alterada. let i_like_letters = false; if let Some(i) = emoticon { println!("Matched {:?}!", i); // Desestruturação falhada. Avaliar uma condição `else if` para // ver se o ramo de falha alternada deveria ser considerado: } else if i_like_letters { println!("Didn't match a number. Let's go with a letter!"); } else { // A condição avaliou `false`. Este ramo é o padrão: println!("I don't like letters. Let's go with an emoticon :)!"); } }
De algum modo, if let
pode ser usada para corresponder apenas o valor de enumeração:
// Nossa enumeração de exemplo enum Foo { Bar, Baz, Qux(u32) } fn main() { // Criar variáveis de exemplo let a = Foo::Bar; let b = Foo::Baz; let c = Foo::Qux(100); // Variável `a` corresponde `Foo::Bar` if let Foo::Bar = a { println!("a is foobar"); } // Variável `b` não corresponde `Foo::Bar` // Então isto não imprimirá nada if let Foo::Bar = b { println!("b is foobar"); } // Variável `c` corresponde `Foo::Qux` que tem um valor // Semelhante à `Some()` no exemplo anterior if let Foo::Qux(value) = c { println!("c is {}", value); } // Vínculos também funcionam com `if let` if let Foo::Qux(value @ 100) = c { println!("c is one hundred"); } }
Um outro beneficio é que if let
permite-nos corresponder variantes de enumeração não parametrizadas. Isto é verdadeiro mesmo em casos onde a enumeração não implementa ou deriva PartialEq
. Em tais casos if Foo::Bar == a
falharia ao compilar, porque as instâncias da enumeração não podem ser equiparadas, no entanto if let
continuará a funcionar.
Gostarias dum desafio? Corrija o seguinte exemplo para usar if let
:
// Esta enumeração propositadamente não implemente nem deriva `PartialEq`. // É por isto que a comparação abaixo `Foo::Bar == a` falha. enum Foo {Bar} fn main() { let a = Foo::Bar; // Variável `a` corresponde `Foo::Bar` if Foo::Bar == a { // ^-- isto causa um erro de tempo de compilação. Use `if let`. println!("a is foobar"); } }
Consulte também:
let-else
🛈 estável desde a: rust 1.65
Com let
-else
, um padrão refutável pode corresponder e vincular variáveis no âmbito envolvente como um let
normal, se não diverge (por exemplo, break
, return
, panic!
) quando o padrão não corresponder:
#![allow(unused)] fn main() { use std::str::FromStr; fn get_count_item(s: &str) -> (u64, &str) { let mut it = s.split(' '); let (Some(count_str), Some(item)) = (it.next(), it.next()) else { panic!("Can't segment count item pair: '{s}'"); }; let Ok(count) = u64::from_str(count_str) else { panic!("Can't parse integer: '{count_str}'"); }; (count, item) } assert_eq!(get_count_item("3 chairs"), (3, "chairs")); }
O âmbito de vínculos de nome é a coisa principal que torna isto diferente de match
ou expressões if let
-else
. Nós poderíamos anteriormente aproximar estes padrões com um lamentável bit de repetição e um let
externo:
#![allow(unused)] fn main() { use std::str::FromStr; fn get_count_item(s: &str) -> (u64, &str) { let mut it = s.split(' '); let (count_str, item) = match (it.next(), it.next()) { (Some(count_str), Some(item)) => (count_str, item), _ => panic!("Can't segment count item pair: '{s}'"), }; let count = if let Ok(count) = u64::from_str(count_str) { count } else { panic!("Can't parse integer: '{count_str}'"); }; (count, item) } assert_eq!(get_count_item("3 chairs"), (3, "chairs")); }
Consulte também:
option
, match
, if let
e o RFC de let-else
.
while let
Semelhante ao if let
, while let
pode tornar sequências de match
embaraçosas mais toleráveis. Considere a seguinte sequência que incrementa i
:
#![allow(unused)] fn main() { // Tornar `optional` do tipo `Option<i32>` let mut optional = Some(0); // Tentar repetidamente este teste. loop { match optional { // Se `optional` desestruturar, avalia o bloco. Some(i) => { if i > 9 { println!("Greater than 9, quit!"); optional = None; } else { println!("`i` is `{:?}`. Try again.", i); optional = Some(i + 1); } // ^ Exige 3 indentações! }, // Sair do laço de repetição quando a desestruturação falhar: _ => { break; } // ^ Porquê que isto deveria ser obrigatório? Deve existir um jeito melhor! } } }
O uso de while let
torna esta sequência muito mais agradável:
fn main() { // Tornar `optional` do tipo `Option<i32>` let mut optional = Some(0); // Isto lê-se: "enquanto `let` desestruturar `optional`" à // `Some(i)`, avaliar o bloco (`{}`). Se não `break`. while let Some(i) = optional { if i > 9 { println!("Greater than 9, quit!"); optional = None; } else { println!("`i` is `{:?}`. Try again.", i); optional = Some(i + 1); } // ^ Menos desvio à direita e não exige // manipular explicitamente o caso de falha. } // ^ `if let` tinha cláusulas `else` ou `else if` opcionais adicionais // `while let` não tem estas. }
Consulte também:
Funções
As funções são declaras usando a palavra-chave fn
. Os seus argumentos têm os tipos anotados, tal como as variáveis, e se a função retornar um valor, o tipo de retorno de ser especificado depois duma flecha (ou seta) ->
.
A expressão final numa função será usada como valor de retorno. Alternativamente, a declaração de return
pode ser usada para retornar um valor mais cedo a partir de dentro da função, até mesmo a partir de dentro dos laços de repetição ou declarações de if
.
Vamos reescrever o FizzBuzz usando funções!:
// Diferente de C e C++, // não existe restrição sobre a ordem de definições de função fn main() { // Nós podemos usar esta função, // e defini-la em algum lugar depois fizzbuzz_to(100); } // Função que retorna um valor booleano fn is_divisible_by(lhs: u32, rhs: u32) -> bool { // Caso extremo, retornar prematuramente if rhs == 0 { return false; } // Isto é uma expressão, // a palavra-chave `return` não é necessária lhs % rhs == 0 } // As funções que "não" retornam um valor, // na realidade retornam o tipo unitário `()` fn fizzbuzz(n: u32) -> () { if is_divisible_by(n, 15) { println!("fizzbuzz"); } else if is_divisible_by(n, 3) { println!("fizz"); } else if is_divisible_by(n, 5) { println!("buzz"); } else { println!("{}", n); } } // Quando uma função retornar `()`, // o tipo de retorno pode ser omitido da assinatura fn fizzbuzz_to(n: u32) { for n in 1..=n { fizzbuzz(n); } }
Funções e Métodos Associados
Algumas funções são conectadas à um topo específico. Existem em duas formas: funções associadas, e métodos. As funções associadas são funções que são definidas geralmente sobre um tipo, enquanto os métodos são funções associadas que são chamados sobre uma instância específica dum tipo:
struct Point { x: f64, y: f64, } // Bloco de implementação, todas funções associadas do `Point` // e os métodos são colocados neste bloco impl Point { // Isto é uma "função associada" pois esta função está associada à // um tipo específico, que é, `Point`. // // Funções associadas não precisam ser chamadas com uma instância. // Estas funções são geralmente usadas como construtoras. fn origin() -> Point { Point { x: 0.0, y: 0.0 } } // Uma outra função associada, recebendo dois argumentos: fn new(x: f64, y: f64) -> Point { Point { x: x, y: y } } } struct Rectangle { p1: Point, p2: Point, } impl Rectangle { // Isto é um método // `&self` é açúcar sintático para `self: &Self`, // aonde `Self` é o tipo do objeto chamador. // Neste caso `Self` = `Rectangle` fn area(&self) -> f64 { // `self` dá acesso aos campos da estrutura através do // operador ponto let Point { x: x1, y: y1 } = self.p1; let Point { x: x2, y: y2 } = self.p2; // `abs` é um método de `f64` que retorna o // valor absoluto do chamador ((x1 - x2) * (y1 - y2)).abs() } fn perimeter(&self) -> f64 { let Point { x: x1, y: y1 } = self.p1; let Point { x: x2, y: y2 } = self.p2; 2.0 * ((x1 - x2).abs() + (y1 - y2).abs()) } // Este método exige que objeto chamador seja mutável. // `&mut self` é açúcar para `self: &mut Self` fn translate(&mut self, x: f64, y: f64) { self.p1.x += x; self.p2.x += x; self.p1.y += y; self.p2.y += y; } } // `Pair` possui os recursos: dois amontoados alocados de inteiros struct Pair(Box<i32>, Box<i32>); impl Pair { // Este método "consome" os recursos do objeto chamador // `self` é açúcar para `self: Self` fn destroy(self) { // Desestruturar `self` let Pair(first, second) = self; println!("Destroying Pair({}, {})", first, second); // `first` e `second` saem do âmbito e são liberados } } fn main() { let rectangle = Rectangle { // As funções associadas são chamadas usando dois-pontos duplos p1: Point::origin(), p2: Point::new(3.0, 4.0), }; // Os métodos são chamados usando o operador ponto // Nota que o primeiro argumento `&self` é implicitamente passado, isto é, // `rectangle.perimeter()` === `Rectangle::perimeter(&rectangle)` println!("Rectangle perimeter: {}", rectangle.perimeter()); println!("Rectangle area: {}", rectangle.area()); let mut square = Rectangle { p1: Point::origin(), p2: Point::new(1.0, 1.0), }; // Error! `rectangle` é imutável, mas este método exige um objeto mutável //rectangle.translate(1.0, 0.0); // TODO ^ Tente desfazer o comentário desta linha // Okay! Os objetos mutáveis podem chamar métodos mutáveis square.translate(1.0, 1.0); let pair = Pair(Box::new(1), Box::new(2)); pair.destroy(); // Error! A chama de `destroy` anterior "consumiu" `pair` //pair.destroy(); // TODO ^ Tente desfazer o comentário desta linha }
Closures
Closures are functions that can capture the enclosing environment. For example, a closure that captures the x
variable:
|val| val + x
The syntax and capabilities of closures make them very convenient for on the fly usage. Calling a closure is exactly like calling a function. However, both input and return types can be inferred and input variable names must be specified.
Other characteristics of closures include:
- using
||
instead of()
around input variables. - optional body delimination (
{}
) for a single expression (mandatory otherwise). - the ability to capture the outer environment variables.
fn main() { let outer_var = 42; // A regular function can't refer to variables in the enclosing environment //fn function(i: i32) -> i32 { i + outer_var } // TODO: uncomment the line above and see the compiler error. The compiler // suggests that we define a closure instead. // Closures are anonymous, here we are binding them to references // Annotation is identical to function annotation but is optional // as are the `{}` wrapping the body. These nameless functions // are assigned to appropriately named variables. let closure_annotated = |i: i32| -> i32 { i + outer_var }; let closure_inferred = |i | i + outer_var ; // Call the closures. println!("closure_annotated: {}", closure_annotated(1)); println!("closure_inferred: {}", closure_inferred(1)); // Once closure's type has been inferred, it cannot be inferred again with another type. //println!("cannot reuse closure_inferred with another type: {}", closure_inferred(42i64)); // TODO: uncomment the line above and see the compiler error. // A closure taking no arguments which returns an `i32`. // The return type is inferred. let one = || 1; println!("closure returning one: {}", one()); }
Capturing
Closures are inherently flexible and will do what the functionality requires to make the closure work without annotation. This allows capturing to flexibly adapt to the use case, sometimes moving and sometimes borrowing. Closures can capture variables:
- by reference:
&T
- by mutable reference:
&mut T
- by value:
T
They preferentially capture variables by reference and only go lower when required.
fn main() { use std::mem; let color = String::from("green"); // A closure to print `color` which immediately borrows (`&`) `color` and // stores the borrow and closure in the `print` variable. It will remain // borrowed until `print` is used the last time. // // `println!` only requires arguments by immutable reference so it doesn't // impose anything more restrictive. let print = || println!("`color`: {}", color); // Call the closure using the borrow. print(); // `color` can be borrowed immutably again, because the closure only holds // an immutable reference to `color`. let _reborrow = &color; print(); // A move or reborrow is allowed after the final use of `print` let _color_moved = color; let mut count = 0; // A closure to increment `count` could take either `&mut count` or `count` // but `&mut count` is less restrictive so it takes that. Immediately // borrows `count`. // // A `mut` is required on `inc` because a `&mut` is stored inside. Thus, // calling the closure mutates the closure which requires a `mut`. let mut inc = || { count += 1; println!("`count`: {}", count); }; // Call the closure using a mutable borrow. inc(); // The closure still mutably borrows `count` because it is called later. // An attempt to reborrow will lead to an error. // let _reborrow = &count; // ^ TODO: try uncommenting this line. inc(); // The closure no longer needs to borrow `&mut count`. Therefore, it is // possible to reborrow without an error let _count_reborrowed = &mut count; // A non-copy type. let movable = Box::new(3); // `mem::drop` requires `T` so this must take by value. A copy type // would copy into the closure leaving the original untouched. // A non-copy must move and so `movable` immediately moves into // the closure. let consume = || { println!("`movable`: {:?}", movable); mem::drop(movable); }; // `consume` consumes the variable so this can only be called once. consume(); // consume(); // ^ TODO: Try uncommenting this line. }
Using move
before vertical pipes forces closure to take ownership of captured variables:
fn main() { // `Vec` has non-copy semantics. let haystack = vec![1, 2, 3]; let contains = move |needle| haystack.contains(needle); println!("{}", contains(&1)); println!("{}", contains(&4)); // println!("There're {} elements in vec", haystack.len()); // ^ Uncommenting above line will result in compile-time error // because borrow checker doesn't allow re-using variable after it // has been moved. // Removing `move` from closure's signature will cause closure // to borrow _haystack_ variable immutably, hence _haystack_ is still // available and uncommenting above line will not cause an error. }
See also:
Box
and std::mem::drop
As input parameters
While Rust chooses how to capture variables on the fly mostly without type annotation, this ambiguity is not allowed when writing functions. When taking a closure as an input parameter, the closure's complete type must be annotated using one of a few traits
, and they're determined by what the closure does with captured value. In order of decreasing restriction, they are:
Fn
: the closure uses the captured value by reference (&T
)FnMut
: the closure uses the captured value by mutable reference (&mut T
)FnOnce
: the closure uses the captured value by value (T
)
On a variable-by-variable basis, the compiler will capture variables in the least restrictive manner possible.
For instance, consider a parameter annotated as FnOnce
. This specifies that the closure may capture by &T
, &mut T
, or T
, but the compiler will ultimately choose based on how the captured variables are used in the closure.
This is because if a move is possible, then any type of borrow should also be possible. Note that the reverse is not true. If the parameter is annotated as Fn
, then capturing variables by &mut T
or T
are not allowed. However, &T
is allowed.
In the following example, try swapping the usage of Fn
, FnMut
, and FnOnce
to see what happens:
// A function which takes a closure as an argument and calls it. // <F> denotes that F is a "Generic type parameter" fn apply<F>(f: F) where // The closure takes no input and returns nothing. F: FnOnce() { // ^ TODO: Try changing this to `Fn` or `FnMut`. f(); } // A function which takes a closure and returns an `i32`. fn apply_to_3<F>(f: F) -> i32 where // The closure takes an `i32` and returns an `i32`. F: Fn(i32) -> i32 { f(3) } fn main() { use std::mem; let greeting = "hello"; // A non-copy type. // `to_owned` creates owned data from borrowed one let mut farewell = "goodbye".to_owned(); // Capture 2 variables: `greeting` by reference and // `farewell` by value. let diary = || { // `greeting` is by reference: requires `Fn`. println!("I said {}.", greeting); // Mutation forces `farewell` to be captured by // mutable reference. Now requires `FnMut`. farewell.push_str("!!!"); println!("Then I screamed {}.", farewell); println!("Now I can sleep. zzzzz"); // Manually calling drop forces `farewell` to // be captured by value. Now requires `FnOnce`. mem::drop(farewell); }; // Call the function which applies the closure. apply(diary); // `double` satisfies `apply_to_3`'s trait bound let double = |x| 2 * x; println!("3 doubled: {}", apply_to_3(double)); }
See also:
std::mem::drop
, Fn
, FnMut
, Generics, where and FnOnce
Type anonymity
Closures succinctly capture variables from enclosing scopes. Does this have any consequences? It surely does. Observe how using a closure as a function parameter requires generics, which is necessary because of how they are defined:
#![allow(unused)] fn main() { // `F` must be generic. fn apply<F>(f: F) where F: FnOnce() { f(); } }
When a closure is defined, the compiler implicitly creates a new anonymous structure to store the captured variables inside, meanwhile implementing the functionality via one of the traits
: Fn
, FnMut
, or FnOnce
for this unknown type. This type is assigned to the variable which is stored until calling.
Since this new type is of unknown type, any usage in a function will require generics. However, an unbounded type parameter <T>
would still be ambiguous and not be allowed. Thus, bounding by one of the traits
: Fn
, FnMut
, or FnOnce
(which it implements) is sufficient to specify its type.
// `F` must implement `Fn` for a closure which takes no // inputs and returns nothing - exactly what is required // for `print`. fn apply<F>(f: F) where F: Fn() { f(); } fn main() { let x = 7; // Capture `x` into an anonymous type and implement // `Fn` for it. Store it in `print`. let print = || println!("{}", x); apply(print); }
See also:
A thorough analysis, Fn
, FnMut
, and FnOnce
Input functions
Since closures may be used as arguments, you might wonder if the same can be said about functions. And indeed they can! If you declare a function that takes a closure as parameter, then any function that satisfies the trait bound of that closure can be passed as a parameter.
// Define a function which takes a generic `F` argument // bounded by `Fn`, and calls it fn call_me<F: Fn()>(f: F) { f(); } // Define a wrapper function satisfying the `Fn` bound fn function() { println!("I'm a function!"); } fn main() { // Define a closure satisfying the `Fn` bound let closure = || println!("I'm a closure!"); call_me(closure); call_me(function); }
As an additional note, the Fn
, FnMut
, and FnOnce
traits
dictate how a closure captures variables from the enclosing scope.
See also:
As output parameters
Closures as input parameters are possible, so returning closures as output parameters should also be possible. However, anonymous closure types are, by definition, unknown, so we have to use impl Trait
to return them.
The valid traits for returning a closure are:
Fn
FnMut
FnOnce
Beyond this, the move
keyword must be used, which signals that all captures occur by value. This is required because any captures by reference would be dropped as soon as the function exited, leaving invalid references in the closure.
fn create_fn() -> impl Fn() { let text = "Fn".to_owned(); move || println!("This is a: {}", text) } fn create_fnmut() -> impl FnMut() { let text = "FnMut".to_owned(); move || println!("This is a: {}", text) } fn create_fnonce() -> impl FnOnce() { let text = "FnOnce".to_owned(); move || println!("This is a: {}", text) } fn main() { let fn_plain = create_fn(); let mut fn_mut = create_fnmut(); let fn_once = create_fnonce(); fn_plain(); fn_mut(); fn_once(); }
See also:
Fn
, FnMut
, Generics and impl Trait.
Examples in std
This section contains a few examples of using closures from the std
library.
Iterator::any
Iterator::any
is a function which when passed an iterator, will return true
if any element satisfies the predicate. Otherwise false
. Its signature:
pub trait Iterator {
// The type being iterated over.
type Item;
// `any` takes `&mut self` meaning the caller may be borrowed
// and modified, but not consumed.
fn any<F>(&mut self, f: F) -> bool where
// `FnMut` meaning any captured variable may at most be
// modified, not consumed. `Self::Item` states it takes
// arguments to the closure by value.
F: FnMut(Self::Item) -> bool;
}
fn main() { let vec1 = vec![1, 2, 3]; let vec2 = vec![4, 5, 6]; // `iter()` for vecs yields `&i32`. Destructure to `i32`. println!("2 in vec1: {}", vec1.iter() .any(|&x| x == 2)); // `into_iter()` for vecs yields `i32`. No destructuring required. println!("2 in vec2: {}", vec2.into_iter().any(|x| x == 2)); // `iter()` only borrows `vec1` and its elements, so they can be used again println!("vec1 len: {}", vec1.len()); println!("First element of vec1 is: {}", vec1[0]); // `into_iter()` does move `vec2` and its elements, so they cannot be used again // println!("First element of vec2 is: {}", vec2[0]); // println!("vec2 len: {}", vec2.len()); // TODO: uncomment two lines above and see compiler errors. let array1 = [1, 2, 3]; let array2 = [4, 5, 6]; // `iter()` for arrays yields `&i32`. println!("2 in array1: {}", array1.iter() .any(|&x| x == 2)); // `into_iter()` for arrays yields `i32`. println!("2 in array2: {}", array2.into_iter().any(|x| x == 2)); }
See also:
Searching through iterators
Iterator::find
is a function which iterates over an iterator and searches for the first value which satisfies some condition. If none of the values satisfy the condition, it returns None
. Its signature:
pub trait Iterator {
// The type being iterated over.
type Item;
// `find` takes `&mut self` meaning the caller may be borrowed
// and modified, but not consumed.
fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where
// `FnMut` meaning any captured variable may at most be
// modified, not consumed. `&Self::Item` states it takes
// arguments to the closure by reference.
P: FnMut(&Self::Item) -> bool;
}
fn main() { let vec1 = vec![1, 2, 3]; let vec2 = vec![4, 5, 6]; // `iter()` for vecs yields `&i32`. let mut iter = vec1.iter(); // `into_iter()` for vecs yields `i32`. let mut into_iter = vec2.into_iter(); // `iter()` for vecs yields `&i32`, and we want to reference one of its // items, so we have to destructure `&&i32` to `i32` println!("Find 2 in vec1: {:?}", iter .find(|&&x| x == 2)); // `into_iter()` for vecs yields `i32`, and we want to reference one of // its items, so we have to destructure `&i32` to `i32` println!("Find 2 in vec2: {:?}", into_iter.find(| &x| x == 2)); let array1 = [1, 2, 3]; let array2 = [4, 5, 6]; // `iter()` for arrays yields `&i32` println!("Find 2 in array1: {:?}", array1.iter() .find(|&&x| x == 2)); // `into_iter()` for arrays yields `i32` println!("Find 2 in array2: {:?}", array2.into_iter().find(|&x| x == 2)); }
Iterator::find
gives you a reference to the item. But if you want the index of the item, use Iterator::position
.
fn main() { let vec = vec![1, 9, 3, 3, 13, 2]; // `iter()` for vecs yields `&i32` and `position()` does not take a reference, so // we have to destructure `&i32` to `i32` let index_of_first_even_number = vec.iter().position(|&x| x % 2 == 0); assert_eq!(index_of_first_even_number, Some(5)); // `into_iter()` for vecs yields `i32` and `position()` does not take a reference, so // we do not have to destructure let index_of_first_negative_number = vec.into_iter().position(|x| x < 0); assert_eq!(index_of_first_negative_number, None); }
See also:
std::iter::Iterator::rposition
Funções de Ordem Superior
A Rust fornece Funções de Ordem Superior (HOF, sigla em Inglês). Estas são funções que recebem um ou mais funções e ou produzem uma função mais útil. As funções de ordem superior e iteradores preguiçosos dão à Rust o seu sabor funcional:
fn is_odd(n: u32) -> bool { n % 2 == 1 } fn main() { println!("Find the sum of all the squared odd numbers under 1000"); let upper = 1000; // Abordagem imperativa // Declarar variável acumuladora let mut acc = 0; // Iterar: 0, 1, 2, ... à infinito for n in 0.. { // Fazer a quadratura do número let n_squared = n * n; if n_squared >= upper { // Quebrar o laço se excedido o limite superior break; } else if is_odd(n_squared) { // Acumular o valor, se for ímpar acc += n_squared; } } println!("imperative style: {}", acc); // Abordagem funcional let sum_of_squared_odd_numbers: u32 = (0..).map(|n| n * n) // Todos os números naturais ao quadrado .take_while(|&n_squared| n_squared < upper) // Abaixo do limite superior .filter(|&n_squared| is_odd(n_squared)) // Que são ímpares .sum(); // Some-os println!("functional style: {}", sum_of_squared_odd_numbers); }
Option e Iterator implementam a sua parte clara de funções de ordem superior.
Funções Divergentes
As funções divergentes nunca retornam. Elas são marcadas usando !
, o qual é um tipo vazio.
#![allow(unused)] fn main() { fn foo() -> ! { panic!("This call never returns."); } }
Em oposição à todos os outros tipos, esta não pode ser instanciada, porque o conjunto de todos os possíveis valores que este tipo pode ter é vazio. Nota que, é diferente do tipo ()
, que tem exatamente um valor possível.
Por exemplo, esta função retorna de costume, embora não exista informação no valor de retorno:
fn some_fn() { () } fn main() { let _a: () = some_fn(); println!("This function returns and you can see this line."); }
Em posição à esta função, que nunca retornará o controlo de volta ao chamador.
#![feature(never_type)]
fn main() {
let x: ! = panic!("This call never returns.");
println!("You will never see this line!");
}
Embora isto possa parecer como um conceito abstrato, é de fato muito útil e muitas vezes prático. A principal vantagem deste tipo é que pode ser fundido à qualquer outro e, portanto usado em lugares onde um tipo exato é obrigatório, por exemplo nos ramos de match
. Isto permite-nos escrever código como este:
fn main() { fn sum_odd_numbers(up_to: u32) -> u32 { let mut acc = 0; for i in 0..up_to { // Repara que o tipo de retorno desta expressão de correspondência // de ser u32 por causa do tipo da variável "addition". let addition: u32 = match i%2 == 1 { // A variável "i" é do tipo u32, que está perfeitamente correta. true => i, // Por outro lado, a expressão "continue" não retorna u32, // mas ainda está bem, porque nunca retorna e portanto não viola os // requisitos da expressão de correspondência. false => continue, }; acc += addition; } acc } println!("Sum of odd numbers up to 9 (excluding): {}", sum_odd_numbers(9)); }
Ela também é o tipo de retorno das função que percorrem os laços de repetição para sempre (por exemplo, loop {}
) tal como servidores de rede ou funções que terminam o processo (por exemplo, exit()
).
Módulos
A Rust fornece um sistema de módulo poderoso que pode ser usado para separar o código hierarquicamente em unidades lógicas (módulos), e gerir a visibilidade (público ou privado) entre eles.
Um módulo é uma coleção de itens: funções, estruturas, características, blocos de impl
, e até mesmo outros módulos.
Visibility
By default, the items in a module have private visibility, but this can be overridden with the pub
modifier. Only the public items of a module can be accessed from outside the module scope.
// A module named `my_mod` mod my_mod { // Items in modules default to private visibility. fn private_function() { println!("called `my_mod::private_function()`"); } // Use the `pub` modifier to override default visibility. pub fn function() { println!("called `my_mod::function()`"); } // Items can access other items in the same module, // even when private. pub fn indirect_access() { print!("called `my_mod::indirect_access()`, that\n> "); private_function(); } // Modules can also be nested pub mod nested { pub fn function() { println!("called `my_mod::nested::function()`"); } #[allow(dead_code)] fn private_function() { println!("called `my_mod::nested::private_function()`"); } // Functions declared using `pub(in path)` syntax are only visible // within the given path. `path` must be a parent or ancestor module pub(in crate::my_mod) fn public_function_in_my_mod() { print!("called `my_mod::nested::public_function_in_my_mod()`, that\n> "); public_function_in_nested(); } // Functions declared using `pub(self)` syntax are only visible within // the current module, which is the same as leaving them private pub(self) fn public_function_in_nested() { println!("called `my_mod::nested::public_function_in_nested()`"); } // Functions declared using `pub(super)` syntax are only visible within // the parent module pub(super) fn public_function_in_super_mod() { println!("called `my_mod::nested::public_function_in_super_mod()`"); } } pub fn call_public_function_in_my_mod() { print!("called `my_mod::call_public_function_in_my_mod()`, that\n> "); nested::public_function_in_my_mod(); print!("> "); nested::public_function_in_super_mod(); } // pub(crate) makes functions visible only within the current crate pub(crate) fn public_function_in_crate() { println!("called `my_mod::public_function_in_crate()`"); } // Nested modules follow the same rules for visibility mod private_nested { #[allow(dead_code)] pub fn function() { println!("called `my_mod::private_nested::function()`"); } // Private parent items will still restrict the visibility of a child item, // even if it is declared as visible within a bigger scope. #[allow(dead_code)] pub(crate) fn restricted_function() { println!("called `my_mod::private_nested::restricted_function()`"); } } } fn function() { println!("called `function()`"); } fn main() { // Modules allow disambiguation between items that have the same name. function(); my_mod::function(); // Public items, including those inside nested modules, can be // accessed from outside the parent module. my_mod::indirect_access(); my_mod::nested::function(); my_mod::call_public_function_in_my_mod(); // pub(crate) items can be called from anywhere in the same crate my_mod::public_function_in_crate(); // pub(in path) items can only be called from within the module specified // Error! function `public_function_in_my_mod` is private //my_mod::nested::public_function_in_my_mod(); // TODO ^ Try uncommenting this line // Private items of a module cannot be directly accessed, even if // nested in a public module: // Error! `private_function` is private //my_mod::private_function(); // TODO ^ Try uncommenting this line // Error! `private_function` is private //my_mod::nested::private_function(); // TODO ^ Try uncommenting this line // Error! `private_nested` is a private module //my_mod::private_nested::function(); // TODO ^ Try uncommenting this line // Error! `private_nested` is a private module //my_mod::private_nested::restricted_function(); // TODO ^ Try uncommenting this line }
Struct visibility
Structs have an extra level of visibility with their fields. The visibility defaults to private, and can be overridden with the pub
modifier. This visibility only matters when a struct is accessed from outside the module where it is defined, and has the goal of hiding information (encapsulation).
mod my { // A public struct with a public field of generic type `T` pub struct OpenBox<T> { pub contents: T, } // A public struct with a private field of generic type `T` pub struct ClosedBox<T> { contents: T, } impl<T> ClosedBox<T> { // A public constructor method pub fn new(contents: T) -> ClosedBox<T> { ClosedBox { contents: contents, } } } } fn main() { // Public structs with public fields can be constructed as usual let open_box = my::OpenBox { contents: "public information" }; // and their fields can be normally accessed. println!("The open box contains: {}", open_box.contents); // Public structs with private fields cannot be constructed using field names. // Error! `ClosedBox` has private fields //let closed_box = my::ClosedBox { contents: "classified information" }; // TODO ^ Try uncommenting this line // However, structs with private fields can be created using // public constructors let _closed_box = my::ClosedBox::new("classified information"); // and the private fields of a public struct cannot be accessed. // Error! The `contents` field is private //println!("The closed box contains: {}", _closed_box.contents); // TODO ^ Try uncommenting this line }
See also:
The use
declaration
The use
declaration can be used to bind a full path to a new name, for easier access. It is often used like this:
use crate::deeply::nested::{
my_first_function,
my_second_function,
AndATraitType
};
fn main() {
my_first_function();
}
You can use the as
keyword to bind imports to a different name:
// Bind the `deeply::nested::function` path to `other_function`. use deeply::nested::function as other_function; fn function() { println!("called `function()`"); } mod deeply { pub mod nested { pub fn function() { println!("called `deeply::nested::function()`"); } } } fn main() { // Easier access to `deeply::nested::function` other_function(); println!("Entering block"); { // This is equivalent to `use deeply::nested::function as function`. // This `function()` will shadow the outer one. use crate::deeply::nested::function; // `use` bindings have a local scope. In this case, the // shadowing of `function()` is only in this block. function(); println!("Leaving block"); } function(); }
super
and self
The super
and self
keywords can be used in the path to remove ambiguity when accessing items and to prevent unnecessary hardcoding of paths.
fn function() { println!("called `function()`"); } mod cool { pub fn function() { println!("called `cool::function()`"); } } mod my { fn function() { println!("called `my::function()`"); } mod cool { pub fn function() { println!("called `my::cool::function()`"); } } pub fn indirect_call() { // Let's access all the functions named `function` from this scope! print!("called `my::indirect_call()`, that\n> "); // The `self` keyword refers to the current module scope - in this case `my`. // Calling `self::function()` and calling `function()` directly both give // the same result, because they refer to the same function. self::function(); function(); // We can also use `self` to access another module inside `my`: self::cool::function(); // The `super` keyword refers to the parent scope (outside the `my` module). super::function(); // This will bind to the `cool::function` in the *crate* scope. // In this case the crate scope is the outermost scope. { use crate::cool::function as root_function; root_function(); } } } fn main() { my::indirect_call(); }
File hierarchy
Modules can be mapped to a file/directory hierarchy. Let's break down the visibility example in files:
$ tree .
.
├── my
│ ├── inaccessible.rs
│ └── nested.rs
├── my.rs
└── split.rs
In split.rs
:
// This declaration will look for a file named `my.rs` and will
// insert its contents inside a module named `my` under this scope
mod my;
fn function() {
println!("called `function()`");
}
fn main() {
my::function();
function();
my::indirect_access();
my::nested::function();
}
In my.rs
:
// Similarly `mod inaccessible` and `mod nested` will locate the `nested.rs`
// and `inaccessible.rs` files and insert them here under their respective
// modules
mod inaccessible;
pub mod nested;
pub fn function() {
println!("called `my::function()`");
}
fn private_function() {
println!("called `my::private_function()`");
}
pub fn indirect_access() {
print!("called `my::indirect_access()`, that\n> ");
private_function();
}
In my/nested.rs
:
pub fn function() {
println!("called `my::nested::function()`");
}
#[allow(dead_code)]
fn private_function() {
println!("called `my::nested::private_function()`");
}
In my/inaccessible.rs
:
#[allow(dead_code)]
pub fn public_function() {
println!("called `my::inaccessible::public_function()`");
}
Let's check that things still work as before:
$ rustc split.rs && ./split
called `my::function()`
called `function()`
called `my::indirect_access()`, that
> called `my::private_function()`
called `my::nested::function()`
Crates
A crate is a compilation unit in Rust. Whenever rustc some_file.rs
is called, some_file.rs
is treated as the crate file. If some_file.rs
has mod
declarations in it, then the contents of the module files would be inserted in places where mod
declarations in the crate file are found, before running the compiler over it. In other words, modules do not get compiled individually, only crates get compiled.
A crate can be compiled into a binary or into a library. By default, rustc
will produce a binary from a crate. This behavior can be overridden by passing the --crate-type
flag to lib
.
Creating a Library
Let's create a library, and then see how to link it to another crate.
In rary.rs
:
pub fn public_function() {
println!("called rary's `public_function()`");
}
fn private_function() {
println!("called rary's `private_function()`");
}
pub fn indirect_access() {
print!("called rary's `indirect_access()`, that\n> ");
private_function();
}
$ rustc --crate-type=lib rary.rs
$ ls lib*
library.rlib
Libraries get prefixed with "lib", and by default they get named after their crate file, but this default name can be overridden by passing the --crate-name
option to rustc
or by using the crate_name
attribute.
Using a Library
To link a crate to this new library you may use rustc
's --extern
flag. All of its items will then be imported under a module named the same as the library. This module generally behaves the same way as any other module.
// extern crate rary; // May be required for Rust 2015 edition or earlier
fn main() {
rary::public_function();
// Error! `private_function` is private
//rary::private_function();
rary::indirect_access();
}
# Where library.rlib is the path to the compiled library, assumed that it's
# in the same directory here:
$ rustc executable.rs --extern rary=library.rlib && ./executable
called rary's `public_function()`
called rary's `indirect_access()`, that
> called rary's `private_function()`
Cargo
cargo
is the official Rust package management tool. It has lots of really useful features to improve code quality and developer velocity! These include:
- Dependency management and integration with crates.io (the official Rust package registry)
- Awareness of unit tests
- Awareness of benchmarks
This chapter will go through some quick basics, but you can find the comprehensive docs in The Cargo Book.
Dependencies
Most programs have dependencies on some libraries. If you have ever managed dependencies by hand, you know how much of a pain this can be. Luckily, the Rust ecosystem comes standard with cargo
! cargo
can manage dependencies for a project.
To create a new Rust project,
# A binary
cargo new foo
# A library
cargo new --lib bar
For the rest of this chapter, let's assume we are making a binary, rather than a library, but all of the concepts are the same.
After the above commands, you should see a file hierarchy like this:
.
├── bar
│ ├── Cargo.toml
│ └── src
│ └── lib.rs
└── foo
├── Cargo.toml
└── src
└── main.rs
The main.rs
is the root source file for your new foo
project -- nothing new there. The Cargo.toml
is the config file for cargo
for this project. If you look inside it, you should see something like this:
[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]
[dependencies]
The name
field under [package]
determines the name of the project. This is used by crates.io
if you publish the crate (more later). It is also the name of the output binary when you compile.
The version
field is a crate version number using Semantic Versioning.
The authors
field is a list of authors used when publishing the crate.
The [dependencies]
section lets you add dependencies for your project.
For example, suppose that we want our program to have a great CLI. You can find lots of great packages on crates.io (the official Rust package registry). One popular choice is clap. As of this writing, the most recent published version of clap
is 2.27.1
. To add a dependency to our program, we can simply add the following to our Cargo.toml
under [dependencies]
: clap = "2.27.1"
. And that's it! You can start using clap
in your program.
cargo
also supports other types of dependencies. Here is just a small sampling:
[package]
name = "foo"
version = "0.1.0"
authors = ["mark"]
[dependencies]
clap = "2.27.1" # from crates.io
rand = { git = "https://github.com/rust-lang-nursery/rand" } # from online repo
bar = { path = "../bar" } # from a path in the local filesystem
cargo
is more than a dependency manager. All of the available configuration options are listed in the format specification of Cargo.toml
.
To build our project we can execute cargo build
anywhere in the project directory (including subdirectories!). We can also do cargo run
to build and run. Notice that these commands will resolve all dependencies, download crates if needed, and build everything, including your crate. (Note that it only rebuilds what it has not already built, similar to make
).
Voila! That's all there is to it!
Conventions
In the previous chapter, we saw the following directory hierarchy:
foo
├── Cargo.toml
└── src
└── main.rs
Suppose that we wanted to have two binaries in the same project, though. What then?
It turns out that cargo
supports this. The default binary name is main
, as we saw before, but you can add additional binaries by placing them in a bin/
directory:
foo
├── Cargo.toml
└── src
├── main.rs
└── bin
└── my_other_bin.rs
To tell cargo
to only compile or run this binary, we just pass cargo
the --bin my_other_bin
flag, where my_other_bin
is the name of the binary we want to work with.
In addition to extra binaries, cargo
supports more features such as benchmarks, tests, and examples.
In the next chapter, we will look more closely at tests.
Testing
As we know testing is integral to any piece of software! Rust has first-class support for unit and integration testing (see this chapter in TRPL).
From the testing chapters linked above, we see how to write unit tests and integration tests. Organizationally, we can place unit tests in the modules they test and integration tests in their own tests/
directory:
foo
├── Cargo.toml
├── src
│ └── main.rs
│ └── lib.rs
└── tests
├── my_test.rs
└── my_other_test.rs
Each file in tests
is a separate integration test, i.e. a test that is meant to test your library as if it were being called from a dependent crate.
The Testing chapter elaborates on the three different testing styles: Unit, Doc, and Integration.
cargo
naturally provides an easy way to run all of your tests!
$ cargo test
You should see output like this:
$ cargo test
Compiling blah v0.1.0 (file:///nobackup/blah)
Finished dev [unoptimized + debuginfo] target(s) in 0.89 secs
Running target/debug/deps/blah-d3b32b97275ec472
running 3 tests
test test_bar ... ok
test test_baz ... ok
test test_foo_bar ... ok
test test_foo ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
You can also run tests whose name matches a pattern:
$ cargo test test_foo
$ cargo test test_foo
Compiling blah v0.1.0 (file:///nobackup/blah)
Finished dev [unoptimized + debuginfo] target(s) in 0.35 secs
Running target/debug/deps/blah-d3b32b97275ec472
running 2 tests
test test_foo ... ok
test test_foo_bar ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out
One word of caution: Cargo may run multiple tests concurrently, so make sure that they don't race with each other.
One example of this concurrency causing issues is if two tests output to a file, such as below:
#![allow(unused)] fn main() { #[cfg(test)] mod tests { // Import the necessary modules use std::fs::OpenOptions; use std::io::Write; // This test writes to a file #[test] fn test_file() { // Opens the file ferris.txt or creates one if it doesn't exist. let mut file = OpenOptions::new() .append(true) .create(true) .open("ferris.txt") .expect("Failed to open ferris.txt"); // Print "Ferris" 5 times. for _ in 0..5 { file.write_all("Ferris\n".as_bytes()) .expect("Could not write to ferris.txt"); } } // This test tries to write to the same file #[test] fn test_file_also() { // Opens the file ferris.txt or creates one if it doesn't exist. let mut file = OpenOptions::new() .append(true) .create(true) .open("ferris.txt") .expect("Failed to open ferris.txt"); // Print "Corro" 5 times. for _ in 0..5 { file.write_all("Corro\n".as_bytes()) .expect("Could not write to ferris.txt"); } } } }
Although the intent is to get the following:
$ cat ferris.txt
Ferris
Ferris
Ferris
Ferris
Ferris
Corro
Corro
Corro
Corro
Corro
What actually gets put into ferris.txt
is this:
$ cargo test test_foo
Corro
Ferris
Corro
Ferris
Corro
Ferris
Corro
Ferris
Corro
Ferris
Build Scripts
Sometimes a normal build from cargo
is not enough. Perhaps your crate needs some pre-requisites before cargo
will successfully compile, things like code generation, or some native code that needs to be compiled. To solve this problem we have build scripts that Cargo can run.
To add a build script to your package it can either be specified in the Cargo.toml
as follows:
[package]
...
build = "build.rs"
Otherwise Cargo will look for a build.rs
file in the project directory by
default.
How to use a build script
The build script is simply another Rust file that will be compiled and invoked prior to compiling anything else in the package. Hence it can be used to fulfill pre-requisites of your crate.
Cargo provides the script with inputs via environment variables specified here that can be used.
The script provides output via stdout. All lines printed are written to target/debug/build/<pkg>/output
. Further, lines prefixed with cargo:
will be interpreted by Cargo directly and hence can be used to define parameters for the package's compilation.
For further specification and examples have a read of the Cargo specification.
Attributes
An attribute is metadata applied to some module, crate or item. This metadata can be used to/for:
- conditional compilation of code
- set crate name, version and type (binary or library)
- disable lints (warnings)
- enable compiler features (macros, glob imports, etc.)
- link to a foreign library
- mark functions as unit tests
- mark functions that will be part of a benchmark
- attribute like macros
When attributes apply to a whole crate, their syntax is #![crate_attribute]
, and when they apply to a module or item, the syntax is #[item_attribute]
(notice the missing bang !
).
Attributes can take arguments with different syntaxes:
#[attribute = "value"]
#[attribute(key = "value")]
#[attribute(value)]
Attributes can have multiple values and can be separated over multiple lines, too:
#[attribute(value, value2)]
#[attribute(value, value2, value3,
value4, value5)]
dead_code
The compiler provides a dead_code
lint that will warn about unused functions. An attribute can be used to disable the lint.
fn used_function() {} // `#[allow(dead_code)]` is an attribute that disables the `dead_code` lint #[allow(dead_code)] fn unused_function() {} fn noisy_unused_function() {} // FIXME ^ Add an attribute to suppress the warning fn main() { used_function(); }
Note that in real programs, you should eliminate dead code. In these examples we'll allow dead code in some places because of the interactive nature of the examples.
Crates
The crate_type
attribute can be used to tell the compiler whether a crate is a binary or a library (and even which type of library), and the crate_name
attribute can be used to set the name of the crate.
However, it is important to note that both the crate_type
and crate_name
attributes have no effect whatsoever when using Cargo, the Rust package manager. Since Cargo is used for the majority of Rust projects, this means real-world uses of crate_type
and crate_name
are relatively limited.
// This crate is a library #![crate_type = "lib"] // The library is named "rary" #![crate_name = "rary"] pub fn public_function() { println!("called rary's `public_function()`"); } fn private_function() { println!("called rary's `private_function()`"); } pub fn indirect_access() { print!("called rary's `indirect_access()`, that\n> "); private_function(); }
When the crate_type
attribute is used, we no longer need to pass the --crate-type
flag to rustc
.
$ rustc lib.rs
$ ls lib*
library.rlib
cfg
Configuration conditional checks are possible through two different operators:
- the
cfg
attribute:#[cfg(...)]
in attribute position - the
cfg!
macro:cfg!(...)
in boolean expressions
While the former enables conditional compilation, the latter conditionally evaluates to true
or false
literals allowing for checks at run-time. Both utilize identical argument syntax.
cfg!
, unlike #[cfg]
, does not remove any code and only evaluates to true or false. For example, all blocks in an if/else expression need to be valid when cfg!
is used for the condition, regardless of what cfg!
is evaluating.
// This function only gets compiled if the target OS is linux #[cfg(target_os = "linux")] fn are_you_on_linux() { println!("You are running linux!"); } // And this function only gets compiled if the target OS is *not* linux #[cfg(not(target_os = "linux"))] fn are_you_on_linux() { println!("You are *not* running linux!"); } fn main() { are_you_on_linux(); println!("Are you sure?"); if cfg!(target_os = "linux") { println!("Yes. It's definitely linux!"); } else { println!("Yes. It's definitely *not* linux!"); } }
See also:
the reference, cfg!
, and macros.
Custom
Some conditionals like target_os
are implicitly provided by rustc
, but
custom conditionals must be passed to rustc
using the --cfg
flag.
#[cfg(some_condition)] fn conditional_function() { println!("condition met!"); } fn main() { conditional_function(); }
Try to run this to see what happens without the custom cfg
flag.
With the custom cfg
flag:
$ rustc --cfg some_condition custom.rs && ./custom
condition met!
Generics
Generics is the topic of generalizing types and functionalities to broader cases. This is extremely useful for reducing code duplication in many ways, but can call for rather involved syntax. Namely, being generic requires taking great care to specify over which types a generic type is actually considered valid. The simplest and most common use of generics is for type parameters.
A type parameter is specified as generic by the use of angle brackets and upper camel case: <Aaa, Bbb, ...>
. "Generic type parameters" are typically represented as <T>
. In Rust, "generic" also describes anything that accepts one or more generic type parameters <T>
. Any type specified as a generic type parameter is generic, and everything else is concrete (non-generic).
For example, defining a generic function named foo
that takes an argument T
of any type:
fn foo<T>(arg: T) { ... }
Because T
has been specified as a generic type parameter using <T>
, it is considered generic when used here as (arg: T)
. This is the case even if T
has previously been defined as a struct
.
This example shows some of the syntax in action:
// A concrete type `A`. struct A; // In defining the type `Single`, the first use of `A` is not preceded by `<A>`. // Therefore, `Single` is a concrete type, and `A` is defined as above. struct Single(A); // ^ Here is `Single`s first use of the type `A`. // Here, `<T>` precedes the first use of `T`, so `SingleGen` is a generic type. // Because the type parameter `T` is generic, it could be anything, including // the concrete type `A` defined at the top. struct SingleGen<T>(T); fn main() { // `Single` is concrete and explicitly takes `A`. let _s = Single(A); // Create a variable `_char` of type `SingleGen<char>` // and give it the value `SingleGen('a')`. // Here, `SingleGen` has a type parameter explicitly specified. let _char: SingleGen<char> = SingleGen('a'); // `SingleGen` can also have a type parameter implicitly specified: let _t = SingleGen(A); // Uses `A` defined at the top. let _i32 = SingleGen(6); // Uses `i32`. let _char = SingleGen('a'); // Uses `char`. }
See also:
Functions
The same set of rules can be applied to functions: a type T
becomes generic when preceded by <T>
.
Using generic functions sometimes requires explicitly specifying type parameters. This may be the case if the function is called where the return type is generic, or if the compiler doesn't have enough information to infer the necessary type parameters.
A function call with explicitly specified type parameters looks like:
fun::<A, B, ...>()
.
struct A; // Concrete type `A`. struct S(A); // Concrete type `S`. struct SGen<T>(T); // Generic type `SGen`. // The following functions all take ownership of the variable passed into // them and immediately go out of scope, freeing the variable. // Define a function `reg_fn` that takes an argument `_s` of type `S`. // This has no `<T>` so this is not a generic function. fn reg_fn(_s: S) {} // Define a function `gen_spec_t` that takes an argument `_s` of type `SGen<T>`. // It has been explicitly given the type parameter `A`, but because `A` has not // been specified as a generic type parameter for `gen_spec_t`, it is not generic. fn gen_spec_t(_s: SGen<A>) {} // Define a function `gen_spec_i32` that takes an argument `_s` of type `SGen<i32>`. // It has been explicitly given the type parameter `i32`, which is a specific type. // Because `i32` is not a generic type, this function is also not generic. fn gen_spec_i32(_s: SGen<i32>) {} // Define a function `generic` that takes an argument `_s` of type `SGen<T>`. // Because `SGen<T>` is preceded by `<T>`, this function is generic over `T`. fn generic<T>(_s: SGen<T>) {} fn main() { // Using the non-generic functions reg_fn(S(A)); // Concrete type. gen_spec_t(SGen(A)); // Implicitly specified type parameter `A`. gen_spec_i32(SGen(6)); // Implicitly specified type parameter `i32`. // Explicitly specified type parameter `char` to `generic()`. generic::<char>(SGen('a')); // Implicitly specified type parameter `char` to `generic()`. generic(SGen('c')); }
See also:
Implementation
Similar to functions, implementations require care to remain generic.
#![allow(unused)] fn main() { struct S; // Concrete type `S` struct GenericVal<T>(T); // Generic type `GenericVal` // impl of GenericVal where we explicitly specify type parameters: impl GenericVal<f32> {} // Specify `f32` impl GenericVal<S> {} // Specify `S` as defined above // `<T>` Must precede the type to remain generic impl<T> GenericVal<T> {} }
struct Val { val: f64, } struct GenVal<T> { gen_val: T, } // impl of Val impl Val { fn value(&self) -> &f64 { &self.val } } // impl of GenVal for a generic type `T` impl<T> GenVal<T> { fn value(&self) -> &T { &self.gen_val } } fn main() { let x = Val { val: 3.0 }; let y = GenVal { gen_val: 3i32 }; println!("{}, {}", x.value(), y.value()); }
See also:
functions returning references, impl
, and struct
Traits
Of course trait
s can also be generic. Here we define one which reimplements the Drop
trait
as a generic method to drop
itself and an input.
// Non-copyable types. struct Empty; struct Null; // A trait generic over `T`. trait DoubleDrop<T> { // Define a method on the caller type which takes an // additional single parameter `T` and does nothing with it. fn double_drop(self, _: T); } // Implement `DoubleDrop<T>` for any generic parameter `T` and // caller `U`. impl<T, U> DoubleDrop<T> for U { // This method takes ownership of both passed arguments, // deallocating both. fn double_drop(self, _: T) {} } fn main() { let empty = Empty; let null = Null; // Deallocate `empty` and `null`. empty.double_drop(null); //empty; //null; // ^ TODO: Try uncommenting these lines. }
See also:
Bounds
When working with generics, the type parameters often must use traits as bounds to stipulate what functionality a type implements. For example, the following example uses the trait Display
to print and so it requires T
to be bound by Display
; that is, T
must implement Display
.
// Define a function `printer` that takes a generic type `T` which
// must implement trait `Display`.
fn printer<T: Display>(t: T) {
println!("{}", t);
}
Bounding restricts the generic to types that conform to the bounds. That is:
struct S<T: Display>(T);
// Error! `Vec<T>` does not implement `Display`. This
// specialization will fail.
let s = S(vec![1]);
Another effect of bounding is that generic instances are allowed to access the methods of traits specified in the bounds. For example:
// A trait which implements the print marker: `{:?}`. use std::fmt::Debug; trait HasArea { fn area(&self) -> f64; } impl HasArea for Rectangle { fn area(&self) -> f64 { self.length * self.height } } #[derive(Debug)] struct Rectangle { length: f64, height: f64 } #[allow(dead_code)] struct Triangle { length: f64, height: f64 } // The generic `T` must implement `Debug`. Regardless // of the type, this will work properly. fn print_debug<T: Debug>(t: &T) { println!("{:?}", t); } // `T` must implement `HasArea`. Any type which meets // the bound can access `HasArea`'s function `area`. fn area<T: HasArea>(t: &T) -> f64 { t.area() } fn main() { let rectangle = Rectangle { length: 3.0, height: 4.0 }; let _triangle = Triangle { length: 3.0, height: 4.0 }; print_debug(&rectangle); println!("Area: {}", area(&rectangle)); //print_debug(&_triangle); //println!("Area: {}", area(&_triangle)); // ^ TODO: Try uncommenting these. // | Error: Does not implement either `Debug` or `HasArea`. }
As an additional note, where
clauses can also be used to apply bounds in some cases to be more expressive.
See also:
Testcase: empty bounds
A consequence of how bounds work is that even if a trait
doesn't include any functionality, you can still use it as a bound. Eq
and Copy
are examples of such trait
s from the std
library.
struct Cardinal; struct BlueJay; struct Turkey; trait Red {} trait Blue {} impl Red for Cardinal {} impl Blue for BlueJay {} // These functions are only valid for types which implement these // traits. The fact that the traits are empty is irrelevant. fn red<T: Red>(_: &T) -> &'static str { "red" } fn blue<T: Blue>(_: &T) -> &'static str { "blue" } fn main() { let cardinal = Cardinal; let blue_jay = BlueJay; let _turkey = Turkey; // `red()` won't work on a blue jay nor vice versa // because of the bounds. println!("A cardinal is {}", red(&cardinal)); println!("A blue jay is {}", blue(&blue_jay)); //println!("A turkey is {}", red(&_turkey)); // ^ TODO: Try uncommenting this line. }
See also:
std::cmp::Eq
, std::marker::Copy
, and trait
s
Multiple bounds
Multiple bounds for a single type can be applied with a +
. Like normal, different types are separated with ,
.
use std::fmt::{Debug, Display}; fn compare_prints<T: Debug + Display>(t: &T) { println!("Debug: `{:?}`", t); println!("Display: `{}`", t); } fn compare_types<T: Debug, U: Debug>(t: &T, u: &U) { println!("t: `{:?}`", t); println!("u: `{:?}`", u); } fn main() { let string = "words"; let array = [1, 2, 3]; let vec = vec![1, 2, 3]; compare_prints(&string); //compare_prints(&array); // TODO ^ Try uncommenting this. compare_types(&array, &vec); }
See also:
Where clauses
A bound can also be expressed using a where
clause immediately before the opening {
, rather than at the type's first mention. Additionally, where
clauses can apply bounds to arbitrary types, rather than just to type parameters.
Some cases that a where
clause is useful:
- When specifying generic types and bounds separately is clearer:
impl <A: TraitB + TraitC, D: TraitE + TraitF> MyTrait<A, D> for YourType {}
// Expressing bounds with a `where` clause
impl <A, D> MyTrait<A, D> for YourType where
A: TraitB + TraitC,
D: TraitE + TraitF {}
- When using a
where
clause is more expressive than using normal syntax. Theimpl
in this example cannot be directly expressed without awhere
clause:
use std::fmt::Debug; trait PrintInOption { fn print_in_option(self); } // Because we would otherwise have to express this as `T: Debug` or // use another method of indirect approach, this requires a `where` clause: impl<T> PrintInOption for T where Option<T>: Debug { // We want `Option<T>: Debug` as our bound because that is what's // being printed. Doing otherwise would be using the wrong bound. fn print_in_option(self) { println!("{:?}", Some(self)); } } fn main() { let vec = vec![1, 2, 3]; vec.print_in_option(); }
See also:
New Type Idiom
The newtype
idiom gives compile time guarantees that the right type of value is supplied to a program.
For example, an age verification function that checks age in years, must be given a value of type Years
.
struct Years(i64); struct Days(i64); impl Years { pub fn to_days(&self) -> Days { Days(self.0 * 365) } } impl Days { /// truncates partial years pub fn to_years(&self) -> Years { Years(self.0 / 365) } } fn old_enough(age: &Years) -> bool { age.0 >= 18 } fn main() { let age = Years(5); let age_days = age.to_days(); println!("Old enough {}", old_enough(&age)); println!("Old enough {}", old_enough(&age_days.to_years())); // println!("Old enough {}", old_enough(&age_days)); }
Uncomment the last print statement to observe that the type supplied must be Years
.
To obtain the newtype
's value as the base type, you may use the tuple or destructuring syntax like so:
struct Years(i64); fn main() { let years = Years(42); let years_as_primitive_1: i64 = years.0; // Tuple let Years(years_as_primitive_2) = years; // Destructuring }
See also:
Associated items
"Associated Items" refers to a set of rules pertaining to item
s of various types. It is an extension to trait
generics, and allows trait
s to internally define new items.
One such item is called an associated type, providing simpler usage patterns when the trait
is generic over its container type.
See also:
The Problem
A trait
that is generic over its container type has type specification requirements - users of the trait
must specify all of its generic types.
In the example below, the Contains
trait
allows the use of the generic types A
and B
. The trait is then implemented for the Container
type, specifying i32
for A
and B
so that it can be used with fn difference()
.
Because Contains
is generic, we are forced to explicitly state all of the generic types for fn difference()
. In practice, we want a way to express that A
and B
are determined by the input C
. As you will see in the next section, associated types provide exactly that capability.
struct Container(i32, i32); // A trait which checks if 2 items are stored inside of container. // Also retrieves first or last value. trait Contains<A, B> { fn contains(&self, _: &A, _: &B) -> bool; // Explicitly requires `A` and `B`. fn first(&self) -> i32; // Doesn't explicitly require `A` or `B`. fn last(&self) -> i32; // Doesn't explicitly require `A` or `B`. } impl Contains<i32, i32> for Container { // True if the numbers stored are equal. fn contains(&self, number_1: &i32, number_2: &i32) -> bool { (&self.0 == number_1) && (&self.1 == number_2) } // Grab the first number. fn first(&self) -> i32 { self.0 } // Grab the last number. fn last(&self) -> i32 { self.1 } } // `C` contains `A` and `B`. In light of that, having to express `A` and // `B` again is a nuisance. fn difference<A, B, C>(container: &C) -> i32 where C: Contains<A, B> { container.last() - container.first() } fn main() { let number_1 = 3; let number_2 = 10; let container = Container(number_1, number_2); println!("Does container contain {} and {}: {}", &number_1, &number_2, container.contains(&number_1, &number_2)); println!("First number: {}", container.first()); println!("Last number: {}", container.last()); println!("The difference is: {}", difference(&container)); }
See also:
Associated types
The use of "Associated types" improves the overall readability of code by moving inner types locally into a trait as output types. Syntax for the trait
definition is as follows:
#![allow(unused)] fn main() { // `A` and `B` are defined in the trait via the `type` keyword. // (Note: `type` in this context is different from `type` when used for // aliases). trait Contains { type A; type B; // Updated syntax to refer to these new types generically. fn contains(&self, _: &Self::A, _: &Self::B) -> bool; } }
Note that functions that use the trait
Contains
are no longer required to express A
or B
at all:
// Without using associated types
fn difference<A, B, C>(container: &C) -> i32 where
C: Contains<A, B> { ... }
// Using associated types
fn difference<C: Contains>(container: &C) -> i32 { ... }
Let's rewrite the example from the previous section using associated types:
struct Container(i32, i32); // A trait which checks if 2 items are stored inside of container. // Also retrieves first or last value. trait Contains { // Define generic types here which methods will be able to utilize. type A; type B; fn contains(&self, _: &Self::A, _: &Self::B) -> bool; fn first(&self) -> i32; fn last(&self) -> i32; } impl Contains for Container { // Specify what types `A` and `B` are. If the `input` type // is `Container(i32, i32)`, the `output` types are determined // as `i32` and `i32`. type A = i32; type B = i32; // `&Self::A` and `&Self::B` are also valid here. fn contains(&self, number_1: &i32, number_2: &i32) -> bool { (&self.0 == number_1) && (&self.1 == number_2) } // Grab the first number. fn first(&self) -> i32 { self.0 } // Grab the last number. fn last(&self) -> i32 { self.1 } } fn difference<C: Contains>(container: &C) -> i32 { container.last() - container.first() } fn main() { let number_1 = 3; let number_2 = 10; let container = Container(number_1, number_2); println!("Does container contain {} and {}: {}", &number_1, &number_2, container.contains(&number_1, &number_2)); println!("First number: {}", container.first()); println!("Last number: {}", container.last()); println!("The difference is: {}", difference(&container)); }
Phantom type parameters
A phantom type parameter is one that doesn't show up at runtime, but is checked statically (and only) at compile time.
Data types can use extra generic type parameters to act as markers or to perform type checking at compile time. These extra parameters hold no storage values, and have no runtime behavior.
In the following example, we combine std::marker::PhantomData with the phantom type parameter concept to create tuples containing different data types.
use std::marker::PhantomData; // A phantom tuple struct which is generic over `A` with hidden parameter `B`. #[derive(PartialEq)] // Allow equality test for this type. struct PhantomTuple<A, B>(A, PhantomData<B>); // A phantom type struct which is generic over `A` with hidden parameter `B`. #[derive(PartialEq)] // Allow equality test for this type. struct PhantomStruct<A, B> { first: A, phantom: PhantomData<B> } // Note: Storage is allocated for generic type `A`, but not for `B`. // Therefore, `B` cannot be used in computations. fn main() { // Here, `f32` and `f64` are the hidden parameters. // PhantomTuple type specified as `<char, f32>`. let _tuple1: PhantomTuple<char, f32> = PhantomTuple('Q', PhantomData); // PhantomTuple type specified as `<char, f64>`. let _tuple2: PhantomTuple<char, f64> = PhantomTuple('Q', PhantomData); // Type specified as `<char, f32>`. let _struct1: PhantomStruct<char, f32> = PhantomStruct { first: 'Q', phantom: PhantomData, }; // Type specified as `<char, f64>`. let _struct2: PhantomStruct<char, f64> = PhantomStruct { first: 'Q', phantom: PhantomData, }; // Compile-time Error! Type mismatch so these cannot be compared: // println!("_tuple1 == _tuple2 yields: {}", // _tuple1 == _tuple2); // Compile-time Error! Type mismatch so these cannot be compared: // println!("_struct1 == _struct2 yields: {}", // _struct1 == _struct2); }
See also:
Derive, struct, and TupleStructs
Testcase: unit clarification
A useful method of unit conversions can be examined by implementing Add
with a phantom type parameter. The Add
trait
is examined below:
// This construction would impose: `Self + RHS = Output`
// where RHS defaults to Self if not specified in the implementation.
pub trait Add<RHS = Self> {
type Output;
fn add(self, rhs: RHS) -> Self::Output;
}
// `Output` must be `T<U>` so that `T<U> + T<U> = T<U>`.
impl<U> Add for T<U> {
type Output = T<U>;
...
}
The whole implementation:
use std::ops::Add; use std::marker::PhantomData; /// Create void enumerations to define unit types. #[derive(Debug, Clone, Copy)] enum Inch {} #[derive(Debug, Clone, Copy)] enum Mm {} /// `Length` is a type with phantom type parameter `Unit`, /// and is not generic over the length type (that is `f64`). /// /// `f64` already implements the `Clone` and `Copy` traits. #[derive(Debug, Clone, Copy)] struct Length<Unit>(f64, PhantomData<Unit>); /// The `Add` trait defines the behavior of the `+` operator. impl<Unit> Add for Length<Unit> { type Output = Length<Unit>; // add() returns a new `Length` struct containing the sum. fn add(self, rhs: Length<Unit>) -> Length<Unit> { // `+` calls the `Add` implementation for `f64`. Length(self.0 + rhs.0, PhantomData) } } fn main() { // Specifies `one_foot` to have phantom type parameter `Inch`. let one_foot: Length<Inch> = Length(12.0, PhantomData); // `one_meter` has phantom type parameter `Mm`. let one_meter: Length<Mm> = Length(1000.0, PhantomData); // `+` calls the `add()` method we implemented for `Length<Unit>`. // // Since `Length` implements `Copy`, `add()` does not consume // `one_foot` and `one_meter` but copies them into `self` and `rhs`. let two_feet = one_foot + one_foot; let two_meters = one_meter + one_meter; // Addition works. println!("one foot + one_foot = {:?} in", two_feet.0); println!("one meter + one_meter = {:?} mm", two_meters.0); // Nonsensical operations fail as they should: // Compile-time Error: type mismatch. //let one_feter = one_foot + one_meter; }
See also:
Borrowing (&
), Bounds (X: Y
), enum, impl & self,
Overloading, ref, Traits (X for Y
), and TupleStructs.
Scoping rules
Scopes play an important part in ownership, borrowing, and lifetimes. That is, they indicate to the compiler when borrows are valid, when resources can be freed, and when variables are created or destroyed.
RAII
Variables in Rust do more than just hold data in the stack: they also own resources, e.g. Box<T>
owns memory in the heap. Rust enforces RAII (Resource Acquisition Is Initialization), so whenever an object goes out of scope, its destructor is called and its owned resources are freed.
This behavior shields against resource leak bugs, so you'll never have to manually free memory or worry about memory leaks again! Here's a quick showcase:
// raii.rs fn create_box() { // Allocate an integer on the heap let _box1 = Box::new(3i32); // `_box1` is destroyed here, and memory gets freed } fn main() { // Allocate an integer on the heap let _box2 = Box::new(5i32); // A nested scope: { // Allocate an integer on the heap let _box3 = Box::new(4i32); // `_box3` is destroyed here, and memory gets freed } // Creating lots of boxes just for fun // There's no need to manually free memory! for _ in 0u32..1_000 { create_box(); } // `_box2` is destroyed here, and memory gets freed }
Of course, we can double check for memory errors using valgrind
:
$ rustc raii.rs && valgrind ./raii
==26873== Memcheck, a memory error detector
==26873== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==26873== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==26873== Command: ./raii
==26873==
==26873==
==26873== HEAP SUMMARY:
==26873== in use at exit: 0 bytes in 0 blocks
==26873== total heap usage: 1,013 allocs, 1,013 frees, 8,696 bytes allocated
==26873==
==26873== All heap blocks were freed -- no leaks are possible
==26873==
==26873== For counts of detected and suppressed errors, rerun with: -v
==26873== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
No leaks here!
Destructor
The notion of a destructor in Rust is provided through the Drop
trait. The destructor is called when the resource goes out of scope. This trait is not required to be implemented for every type, only implement it for your type if you require its own destructor logic.
Run the below example to see how the Drop
trait works. When the variable in the main
function goes out of scope the custom destructor will be invoked.
struct ToDrop; impl Drop for ToDrop { fn drop(&mut self) { println!("ToDrop is being dropped"); } } fn main() { let x = ToDrop; println!("Made a ToDrop!"); }
See also:
Ownership and moves
Because variables are in charge of freeing their own resources, resources can only have one owner. This also prevents resources from being freed more than once. Note that not all variables own resources (e.g. references).
When doing assignments (let x = y
) or passing function arguments by value (foo(x)
), the ownership of the resources is transferred. In Rust-speak, this is known as a move.
After moving resources, the previous owner can no longer be used. This avoids creating dangling pointers.
// This function takes ownership of the heap allocated memory fn destroy_box(c: Box<i32>) { println!("Destroying a box that contains {}", c); // `c` is destroyed and the memory freed } fn main() { // _Stack_ allocated integer let x = 5u32; // *Copy* `x` into `y` - no resources are moved let y = x; // Both values can be independently used println!("x is {}, and y is {}", x, y); // `a` is a pointer to a _heap_ allocated integer let a = Box::new(5i32); println!("a contains: {}", a); // *Move* `a` into `b` let b = a; // The pointer address of `a` is copied (not the data) into `b`. // Both are now pointers to the same heap allocated data, but // `b` now owns it. // Error! `a` can no longer access the data, because it no longer owns the // heap memory //println!("a contains: {}", a); // TODO ^ Try uncommenting this line // This function takes ownership of the heap allocated memory from `b` destroy_box(b); // Since the heap memory has been freed at this point, this action would // result in dereferencing freed memory, but it's forbidden by the compiler // Error! Same reason as the previous Error //println!("b contains: {}", b); // TODO ^ Try uncommenting this line }
Mutability
Mutability of data can be changed when ownership is transferred.
fn main() { let immutable_box = Box::new(5u32); println!("immutable_box contains {}", immutable_box); // Mutability error //*immutable_box = 4; // *Move* the box, changing the ownership (and mutability) let mut mutable_box = immutable_box; println!("mutable_box contains {}", mutable_box); // Modify the contents of the box *mutable_box = 4; println!("mutable_box now contains {}", mutable_box); }
Partial moves
Within the destructuring of a single variable, both by-move
and by-reference
pattern bindings can be used at the same time. Doing this will result in a partial move of the variable, which means that parts of the variable will be moved while other parts stay. In such a case, the parent variable cannot be used afterwards as a whole, however the parts that are only referenced (and not moved) can still be used.
fn main() { #[derive(Debug)] struct Person { name: String, age: Box<u8>, } let person = Person { name: String::from("Alice"), age: Box::new(20), }; // `name` is moved out of person, but `age` is referenced let Person { name, ref age } = person; println!("The person's age is {}", age); println!("The person's name is {}", name); // Error! borrow of partially moved value: `person` partial move occurs //println!("The person struct is {:?}", person); // `person` cannot be used but `person.age` can be used as it is not moved println!("The person's age from person struct is {}", person.age); }
(In this example, we store the age
variable on the heap to illustrate the partial move: deleting ref
in the above code would give an error as the ownership of person.age
would be moved to the variable age
. If Person.age
were stored on the stack, ref
would not be required as the definition of age
would copy the data from person.age
without moving it.)
See also:
Borrowing
Most of the time, we'd like to access data without taking ownership over it. To accomplish this, Rust uses a borrowing mechanism. Instead of passing objects by value (T
), objects can be passed by reference (&T
).
The compiler statically guarantees (via its borrow checker) that references always point to valid objects. That is, while references to an object exist, the object cannot be destroyed.
// This function takes ownership of a box and destroys it fn eat_box_i32(boxed_i32: Box<i32>) { println!("Destroying box that contains {}", boxed_i32); } // This function borrows an i32 fn borrow_i32(borrowed_i32: &i32) { println!("This int is: {}", borrowed_i32); } fn main() { // Create a boxed i32, and a stacked i32 let boxed_i32 = Box::new(5_i32); let stacked_i32 = 6_i32; // Borrow the contents of the box. Ownership is not taken, // so the contents can be borrowed again. borrow_i32(&boxed_i32); borrow_i32(&stacked_i32); { // Take a reference to the data contained inside the box let _ref_to_i32: &i32 = &boxed_i32; // Error! // Can't destroy `boxed_i32` while the inner value is borrowed later in scope. eat_box_i32(boxed_i32); // FIXME ^ Comment out this line // Attempt to borrow `_ref_to_i32` after inner value is destroyed borrow_i32(_ref_to_i32); // `_ref_to_i32` goes out of scope and is no longer borrowed. } // `boxed_i32` can now give up ownership to `eat_box` and be destroyed eat_box_i32(boxed_i32); }
Mutability
Mutable data can be mutably borrowed using &mut T
. This is called a mutable reference and gives read/write access to the borrower. In contrast, &T
borrows the data via an immutable reference, and the borrower can read the data but not modify it:
#[allow(dead_code)] #[derive(Clone, Copy)] struct Book { // `&'static str` is a reference to a string allocated in read only memory author: &'static str, title: &'static str, year: u32, } // This function takes a reference to a book fn borrow_book(book: &Book) { println!("I immutably borrowed {} - {} edition", book.title, book.year); } // This function takes a reference to a mutable book and changes `year` to 2014 fn new_edition(book: &mut Book) { book.year = 2014; println!("I mutably borrowed {} - {} edition", book.title, book.year); } fn main() { // Create an immutable Book named `immutabook` let immutabook = Book { // string literals have type `&'static str` author: "Douglas Hofstadter", title: "Gödel, Escher, Bach", year: 1979, }; // Create a mutable copy of `immutabook` and call it `mutabook` let mut mutabook = immutabook; // Immutably borrow an immutable object borrow_book(&immutabook); // Immutably borrow a mutable object borrow_book(&mutabook); // Borrow a mutable object as mutable new_edition(&mut mutabook); // Error! Cannot borrow an immutable object as mutable new_edition(&mut immutabook); // FIXME ^ Comment out this line }
See also:
Aliasing
Data can be immutably borrowed any number of times, but while immutably borrowed, the original data can't be mutably borrowed. On the other hand, only one mutable borrow is allowed at a time. The original data can be borrowed again only after the mutable reference has been used for the last time.
struct Point { x: i32, y: i32, z: i32 } fn main() { let mut point = Point { x: 0, y: 0, z: 0 }; let borrowed_point = &point; let another_borrow = &point; // Data can be accessed via the references and the original owner println!("Point has coordinates: ({}, {}, {})", borrowed_point.x, another_borrow.y, point.z); // Error! Can't borrow `point` as mutable because it's currently // borrowed as immutable. // let mutable_borrow = &mut point; // TODO ^ Try uncommenting this line // The borrowed values are used again here println!("Point has coordinates: ({}, {}, {})", borrowed_point.x, another_borrow.y, point.z); // The immutable references are no longer used for the rest of the code so // it is possible to reborrow with a mutable reference. let mutable_borrow = &mut point; // Change data via mutable reference mutable_borrow.x = 5; mutable_borrow.y = 2; mutable_borrow.z = 1; // Error! Can't borrow `point` as immutable because it's currently // borrowed as mutable. // let y = &point.y; // TODO ^ Try uncommenting this line // Error! Can't print because `println!` takes an immutable reference. // println!("Point Z coordinate is {}", point.z); // TODO ^ Try uncommenting this line // Ok! Mutable references can be passed as immutable to `println!` println!("Point has coordinates: ({}, {}, {})", mutable_borrow.x, mutable_borrow.y, mutable_borrow.z); // The mutable reference is no longer used for the rest of the code so it // is possible to reborrow let new_borrowed_point = &point; println!("Point now has coordinates: ({}, {}, {})", new_borrowed_point.x, new_borrowed_point.y, new_borrowed_point.z); }
The ref pattern
When doing pattern matching or destructuring via the let
binding, the ref
keyword can be used to take references to the fields of a struct/tuple. The example below shows a few instances where this can be useful:
#[derive(Clone, Copy)] struct Point { x: i32, y: i32 } fn main() { let c = 'Q'; // A `ref` borrow on the left side of an assignment is equivalent to // an `&` borrow on the right side. let ref ref_c1 = c; let ref_c2 = &c; println!("ref_c1 equals ref_c2: {}", *ref_c1 == *ref_c2); let point = Point { x: 0, y: 0 }; // `ref` is also valid when destructuring a struct. let _copy_of_x = { // `ref_to_x` is a reference to the `x` field of `point`. let Point { x: ref ref_to_x, y: _ } = point; // Return a copy of the `x` field of `point`. *ref_to_x }; // A mutable copy of `point` let mut mutable_point = point; { // `ref` can be paired with `mut` to take mutable references. let Point { x: _, y: ref mut mut_ref_to_y } = mutable_point; // Mutate the `y` field of `mutable_point` via a mutable reference. *mut_ref_to_y = 1; } println!("point is ({}, {})", point.x, point.y); println!("mutable_point is ({}, {})", mutable_point.x, mutable_point.y); // A mutable tuple that includes a pointer let mut mutable_tuple = (Box::new(5u32), 3u32); { // Destructure `mutable_tuple` to change the value of `last`. let (_, ref mut last) = mutable_tuple; *last = 2u32; } println!("tuple is {:?}", mutable_tuple); }
Lifetimes
A lifetime is a construct of the compiler (or more specifically, its borrow checker) uses to ensure all borrows are valid. Specifically, a variable's lifetime begins when it is created and ends when it is destroyed. While lifetimes and scopes are often referred to together, they are not the same.
Take, for example, the case where we borrow a variable via &
. The borrow has a lifetime that is determined by where it is declared. As a result, the borrow is valid as long as it ends before the lender is destroyed. However, the scope of the borrow is determined by where the reference is used.
In the following example and in the rest of this section, we will see how lifetimes relate to scopes, as well as how the two differ.
// Lifetimes are annotated below with lines denoting the creation // and destruction of each variable. // `i` has the longest lifetime because its scope entirely encloses // both `borrow1` and `borrow2`. The duration of `borrow1` compared // to `borrow2` is irrelevant since they are disjoint. fn main() { let i = 3; // Lifetime for `i` starts. ────────────────┐ // │ { // │ let borrow1 = &i; // `borrow1` lifetime starts. ──┐│ // ││ println!("borrow1: {}", borrow1); // ││ } // `borrow1` ends. ─────────────────────────────────┘│ // │ // │ { // │ let borrow2 = &i; // `borrow2` lifetime starts. ──┐│ // ││ println!("borrow2: {}", borrow2); // ││ } // `borrow2` ends. ─────────────────────────────────┘│ // │ } // Lifetime ends. ─────────────────────────────────────┘
Note that no names or types are assigned to label lifetimes. This restricts how lifetimes will be able to be used as we will see.
Explicit annotation
The borrow checker uses explicit lifetime annotations to determine how long references should be valid. In cases where lifetimes are not elided1, Rust requires explicit annotations to determine what the lifetime of a reference should be. The syntax for explicitly annotating a lifetime uses an apostrophe character as follows:
foo<'a>
// `foo` has a lifetime parameter `'a`
Similar to closures, using lifetimes requires generics. Additionally, this lifetime syntax indicates that the lifetime of foo
may not exceed that of 'a
. Explicit annotation of a type has the form &'a T
where 'a
has already been introduced.
In cases with multiple lifetimes, the syntax is similar:
foo<'a, 'b>
// `foo` has lifetime parameters `'a` and `'b`
In this case, the lifetime of foo
cannot exceed that of either 'a
or 'b
.
See the following example for explicit lifetime annotation in use:
// `print_refs` takes two references to `i32` which have different // lifetimes `'a` and `'b`. These two lifetimes must both be at // least as long as the function `print_refs`. fn print_refs<'a, 'b>(x: &'a i32, y: &'b i32) { println!("x is {} and y is {}", x, y); } // A function which takes no arguments, but has a lifetime parameter `'a`. fn failed_borrow<'a>() { let _x = 12; // ERROR: `_x` does not live long enough let y: &'a i32 = &_x; // Attempting to use the lifetime `'a` as an explicit type annotation // inside the function will fail because the lifetime of `&_x` is shorter // than that of `y`. A short lifetime cannot be coerced into a longer one. } fn main() { // Create variables to be borrowed below. let (four, nine) = (4, 9); // Borrows (`&`) of both variables are passed into the function. print_refs(&four, &nine); // Any input which is borrowed must outlive the borrower. // In other words, the lifetime of `four` and `nine` must // be longer than that of `print_refs`. failed_borrow(); // `failed_borrow` contains no references to force `'a` to be // longer than the lifetime of the function, but `'a` is longer. // Because the lifetime is never constrained, it defaults to `'static`. }
elision implicitly annotates lifetimes and so is different.
See also:
Functions
Ignoring elision, function signatures with lifetimes have a few constraints:
- any reference must have an annotated lifetime.
- any reference being returned must have the same lifetime as an input or be
static
.
Additionally, note that returning references without input is banned if it would result in returning references to invalid data. The following example shows off some valid forms of functions with lifetimes:
// One input reference with lifetime `'a` which must live // at least as long as the function. fn print_one<'a>(x: &'a i32) { println!("`print_one`: x is {}", x); } // Mutable references are possible with lifetimes as well. fn add_one<'a>(x: &'a mut i32) { *x += 1; } // Multiple elements with different lifetimes. In this case, it // would be fine for both to have the same lifetime `'a`, but // in more complex cases, different lifetimes may be required. fn print_multi<'a, 'b>(x: &'a i32, y: &'b i32) { println!("`print_multi`: x is {}, y is {}", x, y); } // Returning references that have been passed in is acceptable. // However, the correct lifetime must be returned. fn pass_x<'a, 'b>(x: &'a i32, _: &'b i32) -> &'a i32 { x } //fn invalid_output<'a>() -> &'a String { &String::from("foo") } // The above is invalid: `'a` must live longer than the function. // Here, `&String::from("foo")` would create a `String`, followed by a // reference. Then the data is dropped upon exiting the scope, leaving // a reference to invalid data to be returned. fn main() { let x = 7; let y = 9; print_one(&x); print_multi(&x, &y); let z = pass_x(&x, &y); print_one(z); let mut t = 3; add_one(&mut t); print_one(&t); }
See also:
Methods
Methods are annotated similarly to functions:
struct Owner(i32); impl Owner { // Annotate lifetimes as in a standalone function. fn add_one<'a>(&'a mut self) { self.0 += 1; } fn print<'a>(&'a self) { println!("`print`: {}", self.0); } } fn main() { let mut owner = Owner(18); owner.add_one(); owner.print(); }
See also:
Structs
Annotation of lifetimes in structures are also similar to functions:
// A type `Borrowed` which houses a reference to an // `i32`. The reference to `i32` must outlive `Borrowed`. #[derive(Debug)] struct Borrowed<'a>(&'a i32); // Similarly, both references here must outlive this structure. #[derive(Debug)] struct NamedBorrowed<'a> { x: &'a i32, y: &'a i32, } // An enum which is either an `i32` or a reference to one. #[derive(Debug)] enum Either<'a> { Num(i32), Ref(&'a i32), } fn main() { let x = 18; let y = 15; let single = Borrowed(&x); let double = NamedBorrowed { x: &x, y: &y }; let reference = Either::Ref(&x); let number = Either::Num(y); println!("x is borrowed in {:?}", single); println!("x and y are borrowed in {:?}", double); println!("x is borrowed in {:?}", reference); println!("y is *not* borrowed in {:?}", number); }
See also:
Traits
Annotation of lifetimes in trait methods basically are similar to functions. Note that impl
may have annotation of lifetimes too.
// A struct with annotation of lifetimes. #[derive(Debug)] struct Borrowed<'a> { x: &'a i32, } // Annotate lifetimes to impl. impl<'a> Default for Borrowed<'a> { fn default() -> Self { Self { x: &10, } } } fn main() { let b: Borrowed = Default::default(); println!("b is {:?}", b); }
See also:
Bounds
Just like generic types can be bounded, lifetimes (themselves generic) use bounds as well. The :
character has a slightly different meaning here, but +
is the same. Note how the following read:
T: 'a
: All references inT
must outlive lifetime'a
.T: Trait + 'a
: TypeT
must implement traitTrait
and all references inT
must outlive'a
.
The example below shows the above syntax in action used after keyword where
:
use std::fmt::Debug; // Trait to bound with. #[derive(Debug)] struct Ref<'a, T: 'a>(&'a T); // `Ref` contains a reference to a generic type `T` that has // an unknown lifetime `'a`. `T` is bounded such that any // *references* in `T` must outlive `'a`. Additionally, the lifetime // of `Ref` may not exceed `'a`. // A generic function which prints using the `Debug` trait. fn print<T>(t: T) where T: Debug { println!("`print`: t is {:?}", t); } // Here a reference to `T` is taken where `T` implements // `Debug` and all *references* in `T` outlive `'a`. In // addition, `'a` must outlive the function. fn print_ref<'a, T>(t: &'a T) where T: Debug + 'a { println!("`print_ref`: t is {:?}", t); } fn main() { let x = 7; let ref_x = Ref(&x); print_ref(&ref_x); print(ref_x); }
See also:
generics, bounds in generics, and multiple bounds in generics
Coercion
A longer lifetime can be coerced into a shorter one so that it works inside a scope it normally wouldn't work in. This comes in the form of inferred coercion by the Rust compiler, and also in the form of declaring a lifetime difference:
// Here, Rust infers a lifetime that is as short as possible. // The two references are then coerced to that lifetime. fn multiply<'a>(first: &'a i32, second: &'a i32) -> i32 { first * second } // `<'a: 'b, 'b>` reads as lifetime `'a` is at least as long as `'b`. // Here, we take in an `&'a i32` and return a `&'b i32` as a result of coercion. fn choose_first<'a: 'b, 'b>(first: &'a i32, _: &'b i32) -> &'b i32 { first } fn main() { let first = 2; // Longer lifetime { let second = 3; // Shorter lifetime println!("The product is {}", multiply(&first, &second)); println!("{} is the first", choose_first(&first, &second)); }; }
Static
Rust has a few reserved lifetime names. One of those is 'static
. You might encounter it in two situations:
// A reference with 'static lifetime: let s: &'static str = "hello world"; // 'static as part of a trait bound: fn generic<T>(x: T) where T: 'static {}
Both are related but subtly different and this is a common source for confusion when learning Rust. Here are some examples for each situation:
Reference lifetime
As a reference lifetime 'static
indicates that the data pointed to by the reference lives for the entire lifetime of the running program. It can still be coerced to a shorter lifetime.
There are two ways to make a variable with 'static
lifetime, and both are stored in the read-only memory of the binary:
- Make a constant with the
static
declaration. - Make a
string
literal which has type:&'static str
.
See the following example for a display of each method:
// Make a constant with `'static` lifetime. static NUM: i32 = 18; // Returns a reference to `NUM` where its `'static` // lifetime is coerced to that of the input argument. fn coerce_static<'a>(_: &'a i32) -> &'a i32 { &NUM } fn main() { { // Make a `string` literal and print it: let static_string = "I'm in read-only memory"; println!("static_string: {}", static_string); // When `static_string` goes out of scope, the reference // can no longer be used, but the data remains in the binary. } { // Make an integer to use for `coerce_static`: let lifetime_num = 9; // Coerce `NUM` to lifetime of `lifetime_num`: let coerced_static = coerce_static(&lifetime_num); println!("coerced_static: {}", coerced_static); } println!("NUM: {} stays accessible!", NUM); }
Trait bound
As a trait bound, it means the type does not contain any non-static references. Eg. the receiver can hold on to the type for as long as they want and it will never become invalid until they drop it.
It's important to understand this means that any owned data always passes a 'static
lifetime bound, but a reference to that owned data generally does not:
use std::fmt::Debug; fn print_it( input: impl Debug + 'static ) { println!( "'static value passed in is: {:?}", input ); } fn main() { // i is owned and contains no references, thus it's 'static: let i = 5; print_it(i); // oops, &i only has the lifetime defined by the scope of // main(), so it's not 'static: print_it(&i); }
The compiler will tell you:
error[E0597]: `i` does not live long enough
--> src/lib.rs:15:15
|
15 | print_it(&i);
| ---------^^--
| | |
| | borrowed value does not live long enough
| argument requires that `i` is borrowed for `'static`
16 | }
| - `i` dropped here while still borrowed
See also:
Elision
Some lifetime patterns are overwhelmingly common and so the borrow checker will allow you to omit them to save typing and to improve readability. This is known as elision. Elision exists in Rust solely because these patterns are common.
The following code shows a few examples of elision. For a more comprehensive description of elision, see lifetime elision in the book.
// `elided_input` and `annotated_input` essentially have identical signatures // because the lifetime of `elided_input` is inferred by the compiler: fn elided_input(x: &i32) { println!("`elided_input`: {}", x); } fn annotated_input<'a>(x: &'a i32) { println!("`annotated_input`: {}", x); } // Similarly, `elided_pass` and `annotated_pass` have identical signatures // because the lifetime is added implicitly to `elided_pass`: fn elided_pass(x: &i32) -> &i32 { x } fn annotated_pass<'a>(x: &'a i32) -> &'a i32 { x } fn main() { let x = 3; elided_input(&x); annotated_input(&x); println!("`elided_pass`: {}", elided_pass(&x)); println!("`annotated_pass`: {}", annotated_pass(&x)); }
See also:
Traits
A trait
is a collection of methods defined for an unknown type: Self
. They can access other methods declared in the same trait.
Traits can be implemented for any data type. In the example below, we define Animal
, a group of methods. The Animal
trait
is then implemented for the Sheep
data type, allowing the use of methods from Animal
with a Sheep
:
struct Sheep { naked: bool, name: &'static str } trait Animal { // Associated function signature; `Self` refers to the implementor type. fn new(name: &'static str) -> Self; // Method signatures; these will return a string. fn name(&self) -> &'static str; fn noise(&self) -> &'static str; // Traits can provide default method definitions. fn talk(&self) { println!("{} says {}", self.name(), self.noise()); } } impl Sheep { fn is_naked(&self) -> bool { self.naked } fn shear(&mut self) { if self.is_naked() { // Implementor methods can use the implementor's trait methods. println!("{} is already naked...", self.name()); } else { println!("{} gets a haircut!", self.name); self.naked = true; } } } // Implement the `Animal` trait for `Sheep`. impl Animal for Sheep { // `Self` is the implementor type: `Sheep`. fn new(name: &'static str) -> Sheep { Sheep { name: name, naked: false } } fn name(&self) -> &'static str { self.name } fn noise(&self) -> &'static str { if self.is_naked() { "baaaaah?" } else { "baaaaah!" } } // Default trait methods can be overridden. fn talk(&self) { // For example, we can add some quiet contemplation. println!("{} pauses briefly... {}", self.name, self.noise()); } } fn main() { // Type annotation is necessary in this case. let mut dolly: Sheep = Animal::new("Dolly"); // TODO ^ Try removing the type annotations. dolly.talk(); dolly.shear(); dolly.talk(); }
Derive
The compiler is capable of providing basic implementations for some traits via the #[derive]
attribute. These traits can still be manually implemented if a more complex behavior is required.
The following is a list of derivable traits:
- Comparison traits:
Eq
,PartialEq
,Ord
,PartialOrd
. Clone
, to createT
from&T
via a copy.Copy
, to give a type 'copy semantics' instead of 'move semantics'.Hash
, to compute a hash from&T
.Default
, to create an empty instance of a data type.Debug
, to format a value using the{:?}
formatter.
// `Centimeters`, a tuple struct that can be compared #[derive(PartialEq, PartialOrd)] struct Centimeters(f64); // `Inches`, a tuple struct that can be printed #[derive(Debug)] struct Inches(i32); impl Inches { fn to_centimeters(&self) -> Centimeters { let &Inches(inches) = self; Centimeters(inches as f64 * 2.54) } } // `Seconds`, a tuple struct with no additional attributes struct Seconds(i32); fn main() { let _one_second = Seconds(1); // Error: `Seconds` can't be printed; it doesn't implement the `Debug` trait //println!("One second looks like: {:?}", _one_second); // TODO ^ Try uncommenting this line // Error: `Seconds` can't be compared; it doesn't implement the `PartialEq` trait //let _this_is_true = (_one_second == _one_second); // TODO ^ Try uncommenting this line let foot = Inches(12); println!("One foot equals {:?}", foot); let meter = Centimeters(100.0); let cmp = if foot.to_centimeters() < meter { "smaller" } else { "bigger" }; println!("One foot is {} than one meter.", cmp); }
See also:
Returning Traits with dyn
The Rust compiler needs to know how much space every function's return type requires. This means all your functions have to return a concrete type. Unlike other languages, if you have a trait like Animal
, you can't write a function that returns Animal
, because its different implementations will need different amounts of memory.
However, there's an easy workaround. Instead of returning a trait object directly, our functions return a Box
which contains some Animal
. A box
is just a reference to some memory in the heap. Because a reference has a statically-known size, and the compiler can guarantee it points to a heap-allocated Animal
, we can return a trait from our function!
Rust tries to be as explicit as possible whenever it allocates memory on the heap. So if your function returns a pointer-to-trait-on-heap in this way, you need to write the return type with the dyn
keyword, e.g. Box<dyn Animal>
.
struct Sheep {} struct Cow {} trait Animal { // Instance method signature fn noise(&self) -> &'static str; } // Implement the `Animal` trait for `Sheep`. impl Animal for Sheep { fn noise(&self) -> &'static str { "baaaaah!" } } // Implement the `Animal` trait for `Cow`. impl Animal for Cow { fn noise(&self) -> &'static str { "moooooo!" } } // Returns some struct that implements Animal, but we don't know which one at compile time. fn random_animal(random_number: f64) -> Box<dyn Animal> { if random_number < 0.5 { Box::new(Sheep {}) } else { Box::new(Cow {}) } } fn main() { let random_number = 0.234; let animal = random_animal(random_number); println!("You've randomly chosen an animal, and it says {}", animal.noise()); }
Operator Overloading
In Rust, many of the operators can be overloaded via traits. That is, some operators can be used to accomplish different tasks based on their input arguments. This is possible because operators are syntactic sugar for method calls. For example, the +
operator in a + b
calls the add
method (as in a.add(b)
). This add
method is part of the Add
trait. Hence, the +
operator can be used by any implementor of the Add
trait.
A list of the traits, such as Add
, that overload operators can be found in core::ops
.
use std::ops; struct Foo; struct Bar; #[derive(Debug)] struct FooBar; #[derive(Debug)] struct BarFoo; // The `std::ops::Add` trait is used to specify the functionality of `+`. // Here, we make `Add<Bar>` - the trait for addition with a RHS of type `Bar`. // The following block implements the operation: Foo + Bar = FooBar impl ops::Add<Bar> for Foo { type Output = FooBar; fn add(self, _rhs: Bar) -> FooBar { println!("> Foo.add(Bar) was called"); FooBar } } // By reversing the types, we end up implementing non-commutative addition. // Here, we make `Add<Foo>` - the trait for addition with a RHS of type `Foo`. // This block implements the operation: Bar + Foo = BarFoo impl ops::Add<Foo> for Bar { type Output = BarFoo; fn add(self, _rhs: Foo) -> BarFoo { println!("> Bar.add(Foo) was called"); BarFoo } } fn main() { println!("Foo + Bar = {:?}", Foo + Bar); println!("Bar + Foo = {:?}", Bar + Foo); }
See Also
Drop
The Drop
trait only has one method: drop
, which is called automatically when an object goes out of scope. The main use of the Drop
trait is to free the resources that the implementor instance owns.
Box
, Vec
, String
, File
, and Process
are some examples of types that implement the Drop
trait to free resources. The Drop
trait can also be manually implemented for any custom data type.
The following example adds a print to console to the drop
function to announce when it is called.
struct Droppable { name: &'static str, } // This trivial implementation of `drop` adds a print to console. impl Drop for Droppable { fn drop(&mut self) { println!("> Dropping {}", self.name); } } fn main() { let _a = Droppable { name: "a" }; // block A { let _b = Droppable { name: "b" }; // block B { let _c = Droppable { name: "c" }; let _d = Droppable { name: "d" }; println!("Exiting block B"); } println!("Just exited block B"); println!("Exiting block A"); } println!("Just exited block A"); // Variable can be manually dropped using the `drop` function drop(_a); // TODO ^ Try commenting this line println!("end of the main function"); // `_a` *won't* be `drop`ed again here, because it already has been // (manually) `drop`ed }
Iterators
The Iterator
trait is used to implement iterators over collections such as arrays.
The trait requires only a method to be defined for the next
element, which may be manually defined in an impl
block or automatically defined (as in arrays and ranges).
As a point of convenience for common situations, the for
construct turns some collections into iterators using the .into_iter()
method.
struct Fibonacci { curr: u32, next: u32, } // Implement `Iterator` for `Fibonacci`. // The `Iterator` trait only requires a method to be defined for the `next` element. impl Iterator for Fibonacci { // We can refer to this type using Self::Item type Item = u32; // Here, we define the sequence using `.curr` and `.next`. // The return type is `Option<T>`: // * When the `Iterator` is finished, `None` is returned. // * Otherwise, the next value is wrapped in `Some` and returned. // We use Self::Item in the return type, so we can change // the type without having to update the function signatures. fn next(&mut self) -> Option<Self::Item> { let current = self.curr; self.curr = self.next; self.next = current + self.next; // Since there's no endpoint to a Fibonacci sequence, the `Iterator` // will never return `None`, and `Some` is always returned. Some(current) } } // Returns a Fibonacci sequence generator fn fibonacci() -> Fibonacci { Fibonacci { curr: 0, next: 1 } } fn main() { // `0..3` is an `Iterator` that generates: 0, 1, and 2. let mut sequence = 0..3; println!("Four consecutive `next` calls on 0..3"); println!("> {:?}", sequence.next()); println!("> {:?}", sequence.next()); println!("> {:?}", sequence.next()); println!("> {:?}", sequence.next()); // `for` works through an `Iterator` until it returns `None`. // Each `Some` value is unwrapped and bound to a variable (here, `i`). println!("Iterate through 0..3 using `for`"); for i in 0..3 { println!("> {}", i); } // The `take(n)` method reduces an `Iterator` to its first `n` terms. println!("The first four terms of the Fibonacci sequence are: "); for i in fibonacci().take(4) { println!("> {}", i); } // The `skip(n)` method shortens an `Iterator` by dropping its first `n` terms. println!("The next four terms of the Fibonacci sequence are: "); for i in fibonacci().skip(4).take(4) { println!("> {}", i); } let array = [1u32, 3, 3, 7]; // The `iter` method produces an `Iterator` over an array/slice. println!("Iterate the following array {:?}", &array); for i in array.iter() { println!("> {}", i); } }
impl Trait
impl Trait
can be used in two locations:
- as an argument type
- as a return type
As an argument type
If your function is generic over a trait but you don't mind the specific type, you can simplify the function declaration using impl Trait
as the type of the argument.
For example, consider the following code:
fn parse_csv_document<R: std::io::BufRead>(src: R) -> std::io::Result<Vec<Vec<String>>> { src.lines() .map(|line| { // For each line in the source line.map(|line| { // If the line was read successfully, process it, if not, return the error line.split(',') // Split the line separated by commas .map(|entry| String::from(entry.trim())) // Remove leading and trailing whitespace .collect() // Collect all strings in a row into a Vec<String> }) }) .collect() // Collect all lines into a Vec<Vec<String>> }
parse_csv_document
is generic, allowing it to take any type which implements BufRead, such as BufReader<File>
or [u8]
, but it's not important what type R
is, and R
is only used to declare the type of src
, so the function can also be written as:
fn parse_csv_document(src: impl std::io::BufRead) -> std::io::Result<Vec<Vec<String>>> { src.lines() .map(|line| { // For each line in the source line.map(|line| { // If the line was read successfully, process it, if not, return the error line.split(',') // Split the line separated by commas .map(|entry| String::from(entry.trim())) // Remove leading and trailing whitespace .collect() // Collect all strings in a row into a Vec<String> }) }) .collect() // Collect all lines into a Vec<Vec<String>> }
Note that using impl Trait
as an argument type means that you cannot explicitly state what form of the function you use, i.e. parse_csv_document::<std::io::Empty>(std::io::empty())
will not work with the second example.
As a return type
If your function returns a type that implements MyTrait
, you can write its return type as -> impl MyTrait
. This can help simplify your type signatures quite a lot!
use std::iter; use std::vec::IntoIter; // This function combines two `Vec<i32>` and returns an iterator over it. // Look how complicated its return type is! fn combine_vecs_explicit_return_type( v: Vec<i32>, u: Vec<i32>, ) -> iter::Cycle<iter::Chain<IntoIter<i32>, IntoIter<i32>>> { v.into_iter().chain(u.into_iter()).cycle() } // This is the exact same function, but its return type uses `impl Trait`. // Look how much simpler it is! fn combine_vecs( v: Vec<i32>, u: Vec<i32>, ) -> impl Iterator<Item=i32> { v.into_iter().chain(u.into_iter()).cycle() } fn main() { let v1 = vec![1, 2, 3]; let v2 = vec![4, 5]; let mut v3 = combine_vecs(v1, v2); assert_eq!(Some(1), v3.next()); assert_eq!(Some(2), v3.next()); assert_eq!(Some(3), v3.next()); assert_eq!(Some(4), v3.next()); assert_eq!(Some(5), v3.next()); println!("all done"); }
More importantly, some Rust types can't be written out. For example, every closure has its own unnamed concrete type. Before impl Trait
syntax, you had to allocate on the heap in order to return a closure. But now you can do it all statically, like this:
// Returns a function that adds `y` to its input fn make_adder_function(y: i32) -> impl Fn(i32) -> i32 { let closure = move |x: i32| { x + y }; closure } fn main() { let plus_one = make_adder_function(1); assert_eq!(plus_one(2), 3); }
You can also use impl Trait
to return an iterator that uses map
or filter
closures! This makes using map
and filter
easier. Because closure types don't have names, you can't write out an explicit return type if your function returns iterators with closures. But with impl Trait
you can do this easily:
fn double_positives<'a>(numbers: &'a Vec<i32>) -> impl Iterator<Item = i32> + 'a { numbers .iter() .filter(|x| x > &&0) .map(|x| x * 2) } fn main() { let singles = vec![-3, -2, 2, 3]; let doubles = double_positives(&singles); assert_eq!(doubles.collect::<Vec<i32>>(), vec![4, 6]); }
Clone
When dealing with resources, the default behavior is to transfer them during assignments or function calls. However, sometimes we need to make a copy of the resource as well.
The Clone
trait helps us do exactly this. Most commonly, we can use the .clone()
method defined by the Clone
trait.
// A unit struct without resources #[derive(Debug, Clone, Copy)] struct Unit; // A tuple struct with resources that implements the `Clone` trait #[derive(Clone, Debug)] struct Pair(Box<i32>, Box<i32>); fn main() { // Instantiate `Unit` let unit = Unit; // Copy `Unit`, there are no resources to move let copied_unit = unit; // Both `Unit`s can be used independently println!("original: {:?}", unit); println!("copy: {:?}", copied_unit); // Instantiate `Pair` let pair = Pair(Box::new(1), Box::new(2)); println!("original: {:?}", pair); // Move `pair` into `moved_pair`, moves resources let moved_pair = pair; println!("moved: {:?}", moved_pair); // Error! `pair` has lost its resources //println!("original: {:?}", pair); // TODO ^ Try uncommenting this line // Clone `moved_pair` into `cloned_pair` (resources are included) let cloned_pair = moved_pair.clone(); // Drop the original pair using std::mem::drop drop(moved_pair); // Error! `moved_pair` has been dropped //println!("copy: {:?}", moved_pair); // TODO ^ Try uncommenting this line // The result from .clone() can still be used! println!("clone: {:?}", cloned_pair); }
Supertraits
Rust doesn't have "inheritance", but you can define a trait as being a superset of another trait. For example:
trait Person { fn name(&self) -> String; } // Person is a supertrait of Student. // Implementing Student requires you to also impl Person. trait Student: Person { fn university(&self) -> String; } trait Programmer { fn fav_language(&self) -> String; } // CompSciStudent (computer science student) is a subtrait of both Programmer // and Student. Implementing CompSciStudent requires you to impl both supertraits. trait CompSciStudent: Programmer + Student { fn git_username(&self) -> String; } fn comp_sci_student_greeting(student: &dyn CompSciStudent) -> String { format!( "My name is {} and I attend {}. My favorite language is {}. My Git username is {}", student.name(), student.university(), student.fav_language(), student.git_username() ) } fn main() {}
See also:
The Rust Programming Language chapter on supertraits
Disambiguating overlapping traits
A type can implement many different traits. What if two traits both require the same name? For example, many traits might have a method named get()
. They might even have different return types!
Good news: because each trait implementation gets its own impl
block, it's clear which trait's get
method you're implementing.
What about when it comes time to call those methods? To disambiguate between them, we have to use Fully Qualified Syntax.
trait UsernameWidget { // Get the selected username out of this widget fn get(&self) -> String; } trait AgeWidget { // Get the selected age out of this widget fn get(&self) -> u8; } // A form with both a UsernameWidget and an AgeWidget struct Form { username: String, age: u8, } impl UsernameWidget for Form { fn get(&self) -> String { self.username.clone() } } impl AgeWidget for Form { fn get(&self) -> u8 { self.age } } fn main() { let form = Form { username: "rustacean".to_owned(), age: 28, }; // If you uncomment this line, you'll get an error saying // "multiple `get` found". Because, after all, there are multiple methods // named `get`. // println!("{}", form.get()); let username = <Form as UsernameWidget>::get(&form); assert_eq!("rustacean".to_owned(), username); let age = <Form as AgeWidget>::get(&form); assert_eq!(28, age); }
See also:
The Rust Programming Language chapter on Fully Qualified syntax
macro_rules!
Rust provides a powerful macro system that allows metaprogramming. As you've seen in previous chapters, macros look like functions, except that their name ends with a bang !
, but instead of generating a function call, macros are expanded into source code that gets compiled with the rest of the program. However, unlike macros in C and other languages, Rust macros are expanded into abstract syntax trees, rather than string preprocessing, so you don't get unexpected precedence bugs.
Macros are created using the macro_rules!
macro:
// This is a simple macro named `say_hello`. macro_rules! say_hello { // `()` indicates that the macro takes no argument. () => { // The macro will expand into the contents of this block. println!("Hello!") }; } fn main() { // This call will expand into `println!("Hello")` say_hello!() }
So why are macros useful?
-
Don't repeat yourself. There are many cases where you may need similar functionality in multiple places but with different types. Often, writing a macro is a useful way to avoid repeating code. (More on this later)
-
Domain-specific languages. Macros allow you to define special syntax for a specific purpose. (More on this later)
-
Variadic interfaces. Sometimes you want to define an interface that takes a variable number of arguments. An example is
println!
which could take any number of arguments, depending on the format string. (More on this later)
Syntax
In following subsections, we will show how to define macros in Rust. There are three basic ideas:
Designators
The arguments of a macro are prefixed by a dollar sign $
and type annotated with a designator:
macro_rules! create_function { // This macro takes an argument of designator `ident` and // creates a function named `$func_name`. // The `ident` designator is used for variable/function names. ($func_name:ident) => { fn $func_name() { // The `stringify!` macro converts an `ident` into a string. println!("You called {:?}()", stringify!($func_name)); } }; } // Create functions named `foo` and `bar` with the above macro. create_function!(foo); create_function!(bar); macro_rules! print_result { // This macro takes an expression of type `expr` and prints // it as a string along with its result. // The `expr` designator is used for expressions. ($expression:expr) => { // `stringify!` will convert the expression *as it is* into a string. println!("{:?} = {:?}", stringify!($expression), $expression); }; } fn main() { foo(); bar(); print_result!(1u32 + 1); // Recall that blocks are expressions too! print_result!({ let x = 1u32; x * x + 2 * x - 1 }); }
These are some of the available designators:
block
expr
is used for expressionsident
is used for variable/function namesitem
literal
is used for literal constantspat
(pattern)path
stmt
(statement)tt
(token tree)ty
(type)vis
(visibility qualifier)
For a complete list, see the Rust Reference.
Overload
Macros can be overloaded to accept different combinations of arguments. In that regard, macro_rules!
can work similarly to a match block:
// `test!` will compare `$left` and `$right` // in different ways depending on how you invoke it: macro_rules! test { // Arguments don't need to be separated by a comma. // Any template can be used! ($left:expr; and $right:expr) => { println!("{:?} and {:?} is {:?}", stringify!($left), stringify!($right), $left && $right) }; // ^ each arm must end with a semicolon. ($left:expr; or $right:expr) => { println!("{:?} or {:?} is {:?}", stringify!($left), stringify!($right), $left || $right) }; } fn main() { test!(1i32 + 1 == 2i32; and 2i32 * 2 == 4i32); test!(true; or false); }
Repeat
Macros can use +
in the argument list to indicate that an argument may repeat at least once, or *
, to indicate that the argument may repeat zero or more times.
In the following example, surrounding the matcher with $(...),+
will match one or more expression, separated by commas. Also note that the semicolon is optional on the last case.
// `find_min!` will calculate the minimum of any number of arguments. macro_rules! find_min { // Base case: ($x:expr) => ($x); // `$x` followed by at least one `$y,` ($x:expr, $($y:expr),+) => ( // Call `find_min!` on the tail `$y` std::cmp::min($x, find_min!($($y),+)) ) } fn main() { println!("{}", find_min!(1)); println!("{}", find_min!(1 + 2, 2)); println!("{}", find_min!(5, 2 * 3, 4)); }
DRY (Don't Repeat Yourself)
Macros allow writing DRY code by factoring out the common parts of functions and/or test suites. Here is an example that implements and tests the +=
, *=
and -=
operators on Vec<T>
:
use std::ops::{Add, Mul, Sub}; macro_rules! assert_equal_len { // The `tt` (token tree) designator is used for // operators and tokens. ($a:expr, $b:expr, $func:ident, $op:tt) => { assert!($a.len() == $b.len(), "{:?}: dimension mismatch: {:?} {:?} {:?}", stringify!($func), ($a.len(),), stringify!($op), ($b.len(),)); }; } macro_rules! op { ($func:ident, $bound:ident, $op:tt, $method:ident) => { fn $func<T: $bound<T, Output=T> + Copy>(xs: &mut Vec<T>, ys: &Vec<T>) { assert_equal_len!(xs, ys, $func, $op); for (x, y) in xs.iter_mut().zip(ys.iter()) { *x = $bound::$method(*x, *y); // *x = x.$method(*y); } } }; } // Implement `add_assign`, `mul_assign`, and `sub_assign` functions. op!(add_assign, Add, +=, add); op!(mul_assign, Mul, *=, mul); op!(sub_assign, Sub, -=, sub); mod test { use std::iter; macro_rules! test { ($func:ident, $x:expr, $y:expr, $z:expr) => { #[test] fn $func() { for size in 0usize..10 { let mut x: Vec<_> = iter::repeat($x).take(size).collect(); let y: Vec<_> = iter::repeat($y).take(size).collect(); let z: Vec<_> = iter::repeat($z).take(size).collect(); super::$func(&mut x, &y); assert_eq!(x, z); } } }; } // Test `add_assign`, `mul_assign`, and `sub_assign`. test!(add_assign, 1u32, 2u32, 3u32); test!(mul_assign, 2u32, 3u32, 6u32); test!(sub_assign, 3u32, 2u32, 1u32); }
$ rustc --test dry.rs && ./dry
running 3 tests
test test::mul_assign ... ok
test test::add_assign ... ok
test test::sub_assign ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured
Domain Specific Languages (DSLs)
A DSL is a mini "language" embedded in a Rust macro. It is completely valid Rust because the macro system expands into normal Rust constructs, but it looks like a small language. This allows you to define concise or intuitive syntax for some special functionality (within bounds).
Suppose that I want to define a little calculator API. I would like to supply an expression and have the output printed to console.
macro_rules! calculate { (eval $e:expr) => { { let val: usize = $e; // Force types to be integers println!("{} = {}", stringify!{$e}, val); } }; } fn main() { calculate! { eval 1 + 2 // hehehe `eval` is _not_ a Rust keyword! } calculate! { eval (1 + 2) * (3 / 4) } }
Output:
1 + 2 = 3
(1 + 2) * (3 / 4) = 0
This was a very simple example, but much more complex interfaces have been developed, such as lazy_static
or clap
.
Also, note the two pairs of braces in the macro. The outer ones are part of the syntax of macro_rules!
, in addition to ()
or []
.
Variadic Interfaces
A variadic interface takes an arbitrary number of arguments. For example, println!
can take an arbitrary number of arguments, as determined by the format string.
We can extend our calculate!
macro from the previous section to be variadic:
macro_rules! calculate { // The pattern for a single `eval` (eval $e:expr) => { { let val: usize = $e; // Force types to be integers println!("{} = {}", stringify!{$e}, val); } }; // Decompose multiple `eval`s recursively (eval $e:expr, $(eval $es:expr),+) => {{ calculate! { eval $e } calculate! { $(eval $es),+ } }}; } fn main() { calculate! { // Look ma! Variadic `calculate!`! eval 1 + 2, eval 3 + 4, eval (2 * 3) + 1 } }
Output:
1 + 2 = 3
3 + 4 = 7
(2 * 3) + 1 = 7
Error handling
Error handling is the process of handling the possibility of failure. For example, failing to read a file and then continuing to use that bad input would clearly be problematic. Noticing and explicitly managing those errors saves the rest of the program from various pitfalls.
There are various ways to deal with errors in Rust, which are described in the following subchapters. They all have more or less subtle differences and different use cases. As a rule of thumb:
An explicit panic
is mainly useful for tests and dealing with unrecoverable errors. For prototyping it can be useful, for example when dealing with functions that haven't been implemented yet, but in those cases the more descriptive unimplemented
is better. In tests panic
is a reasonable way to explicitly fail.
The Option
type is for when a value is optional or when the lack of a value is not an error condition. For example the parent of a directory - /
and C:
don't have one. When dealing with Option
s, unwrap
is fine for prototyping and cases where it's absolutely certain that there is guaranteed to be a value. However expect
is more useful since it lets you specify an error message in case something goes wrong anyway.
When there is a chance that things do go wrong and the caller has to deal with the problem, use Result
. You can unwrap
and expect
them as well (please don't do that unless it's a test or quick prototype).
For a more rigorous discussion of error handling, refer to the error handling section in the official book.
panic
The simplest error handling mechanism we will see is panic
. It prints an error message, starts unwinding the stack, and usually exits the program. Here, we explicitly call panic
on our error condition:
fn drink(beverage: &str) { // You shouldn't drink too much sugary beverages. if beverage == "lemonade" { panic!("AAAaaaaa!!!!"); } println!("Some refreshing {} is all I need.", beverage); } fn main() { drink("water"); drink("lemonade"); drink("still water"); }
The first call to drink
works. The second panics and thus the third is never called.
abort
and unwind
The previous section illustrates the error handling mechanism panic
. Different code paths can be conditionally compiled based on the panic setting. The current values available are unwind
and abort
.
Building on the prior lemonade example, we explicitly use the panic strategy to exercise different lines of code.
fn drink(beverage: &str) { // You shouldn't drink too much sugary beverages. if beverage == "lemonade" { if cfg!(panic="abort"){ println!("This is not your party. Run!!!!");} else{ println!("Spit it out!!!!");} } else{ println!("Some refreshing {} is all I need.", beverage); } } fn main() { drink("water"); drink("lemonade"); }
Here is another example focusing on rewriting drink()
and explicitly use the unwind
keyword.
#[cfg(panic = "unwind")] fn ah(){ println!("Spit it out!!!!");} #[cfg(not(panic="unwind"))] fn ah(){ println!("This is not your party. Run!!!!");} fn drink(beverage: &str){ if beverage == "lemonade"{ ah();} else{println!("Some refreshing {} is all I need.", beverage);} } fn main() { drink("water"); drink("lemonade"); }
The panic strategy can be set from the command line by using abort
or unwind
.
rustc lemonade.rs -C panic=abort
Option
& unwrap
In the last example, we showed that we can induce program failure at will. We told our program to panic
if we drink a sugary lemonade. But what if we expect some drink but don't receive one? That case would be just as bad, so it needs to be handled!
We could test this against the null string (""
) as we do with a lemonade. Since we're using Rust, let's instead have the compiler point out cases where there's no drink.
An enum
called Option<T>
in the std
library is used when absence is a possibility. It manifests itself as one of two "options":
Some(T)
: An element of typeT
was foundNone
: No element was found
These cases can either be explicitly handled via match
or implicitly with unwrap
. Implicit handling will either return the inner element or panic
.
Note that it's possible to manually customize panic
with expect, but unwrap
otherwise leaves us with a less meaningful output than explicit handling. In the following example, explicit handling yields a more controlled result while retaining the option to panic
if desired.
// The adult has seen it all, and can handle any drink well. // All drinks are handled explicitly using `match`. fn give_adult(drink: Option<&str>) { // Specify a course of action for each case. match drink { Some("lemonade") => println!("Yuck! Too sugary."), Some(inner) => println!("{}? How nice.", inner), None => println!("No drink? Oh well."), } } // Others will `panic` before drinking sugary drinks. // All drinks are handled implicitly using `unwrap`. fn drink(drink: Option<&str>) { // `unwrap` returns a `panic` when it receives a `None`. let inside = drink.unwrap(); if inside == "lemonade" { panic!("AAAaaaaa!!!!"); } println!("I love {}s!!!!!", inside); } fn main() { let water = Some("water"); let lemonade = Some("lemonade"); let void = None; give_adult(water); give_adult(lemonade); give_adult(void); let coffee = Some("coffee"); let nothing = None; drink(coffee); drink(nothing); }
Unpacking options with ?
You can unpack Option
s by using match
statements, but it's often easier to use the ?
operator. If x
is an Option
, then evaluating x?
will return the underlying value if x
is Some
, otherwise it will terminate whatever function is being executed and return None
.
fn next_birthday(current_age: Option<u8>) -> Option<String> { // If `current_age` is `None`, this returns `None`. // If `current_age` is `Some`, the inner `u8` gets assigned to `next_age` let next_age: u8 = current_age? + 1; Some(format!("Next year I will be {}", next_age)) }
You can chain many ?
s together to make your code much more readable.
struct Person { job: Option<Job>, } #[derive(Clone, Copy)] struct Job { phone_number: Option<PhoneNumber>, } #[derive(Clone, Copy)] struct PhoneNumber { area_code: Option<u8>, number: u32, } impl Person { // Gets the area code of the phone number of the person's job, if it exists. fn work_phone_area_code(&self) -> Option<u8> { // This would need many nested `match` statements without the `?` operator. // It would take a lot more code - try writing it yourself and see which // is easier. self.job?.phone_number?.area_code } } fn main() { let p = Person { job: Some(Job { phone_number: Some(PhoneNumber { area_code: Some(61), number: 439222222, }), }), }; assert_eq!(p.work_phone_area_code(), Some(61)); }
Combinators: map
match
is a valid method for handling Option
s. However, you may eventually find heavy usage tedious, especially with operations only valid with an input. In these cases, combinators can be used to manage control flow in a modular fashion.
Option
has a built in method called map()
, a combinator for the simple mapping of Some -> Some
and None -> None
. Multiple map()
calls can be chained together for even more flexibility.
In the following example, process()
replaces all functions previous to it while staying compact.
#![allow(dead_code)] #[derive(Debug)] enum Food { Apple, Carrot, Potato } #[derive(Debug)] struct Peeled(Food); #[derive(Debug)] struct Chopped(Food); #[derive(Debug)] struct Cooked(Food); // Peeling food. If there isn't any, then return `None`. // Otherwise, return the peeled food. fn peel(food: Option<Food>) -> Option<Peeled> { match food { Some(food) => Some(Peeled(food)), None => None, } } // Chopping food. If there isn't any, then return `None`. // Otherwise, return the chopped food. fn chop(peeled: Option<Peeled>) -> Option<Chopped> { match peeled { Some(Peeled(food)) => Some(Chopped(food)), None => None, } } // Cooking food. Here, we showcase `map()` instead of `match` for case handling. fn cook(chopped: Option<Chopped>) -> Option<Cooked> { chopped.map(|Chopped(food)| Cooked(food)) } // A function to peel, chop, and cook food all in sequence. // We chain multiple uses of `map()` to simplify the code. fn process(food: Option<Food>) -> Option<Cooked> { food.map(|f| Peeled(f)) .map(|Peeled(f)| Chopped(f)) .map(|Chopped(f)| Cooked(f)) } // Check whether there's food or not before trying to eat it! fn eat(food: Option<Cooked>) { match food { Some(food) => println!("Mmm. I love {:?}", food), None => println!("Oh no! It wasn't edible."), } } fn main() { let apple = Some(Food::Apple); let carrot = Some(Food::Carrot); let potato = None; let cooked_apple = cook(chop(peel(apple))); let cooked_carrot = cook(chop(peel(carrot))); // Let's try the simpler looking `process()` now. let cooked_potato = process(potato); eat(cooked_apple); eat(cooked_carrot); eat(cooked_potato); }
See also:
closures, Option
, Option::map()
Combinators: and_then
map()
was described as a chainable way to simplify match
statements. However, using map()
on a function that returns an Option<T>
results in the nested Option<Option<T>>
. Chaining multiple calls together can then become confusing. That's where another combinator called and_then()
, known in some languages as flatmap, comes in.
and_then()
calls its function input with the wrapped value and returns the result. If the Option
is None
, then it returns None
instead.
In the following example, cookable_v2()
results in an Option<Food>
. Using map()
instead of and_then()
would have given an Option<Option<Food>>
, which is an invalid type for eat()
.
#![allow(dead_code)] #[derive(Debug)] enum Food { CordonBleu, Steak, Sushi } #[derive(Debug)] enum Day { Monday, Tuesday, Wednesday } // We don't have the ingredients to make Sushi. fn have_ingredients(food: Food) -> Option<Food> { match food { Food::Sushi => None, _ => Some(food), } } // We have the recipe for everything except Cordon Bleu. fn have_recipe(food: Food) -> Option<Food> { match food { Food::CordonBleu => None, _ => Some(food), } } // To make a dish, we need both the recipe and the ingredients. // We can represent the logic with a chain of `match`es: fn cookable_v1(food: Food) -> Option<Food> { match have_recipe(food) { None => None, Some(food) => have_ingredients(food), } } // This can conveniently be rewritten more compactly with `and_then()`: fn cookable_v2(food: Food) -> Option<Food> { have_recipe(food).and_then(have_ingredients) } fn eat(food: Food, day: Day) { match cookable_v2(food) { Some(food) => println!("Yay! On {:?} we get to eat {:?}.", day, food), None => println!("Oh no. We don't get to eat on {:?}?", day), } } fn main() { let (cordon_bleu, steak, sushi) = (Food::CordonBleu, Food::Steak, Food::Sushi); eat(cordon_bleu, Day::Monday); eat(steak, Day::Tuesday); eat(sushi, Day::Wednesday); }
See also:
closures, Option
, and Option::and_then()
Unpacking options and defaults
There is more than one way to unpack an Option
and fall back on a default if it is None
. To choose the one that meets our needs, we need to consider the following:
- do we need eager or lazy evaluation?
- do we need to keep the original empty value intact, or modify it in place?
or()
is chainable, evaluates eagerly, keeps empty value intact
or()
is chainable and eagerly evaluates its argument, as is shown in the following example. Note that because or
's arguments are evaluated eagerly, the variable passed to or
is moved.
#[derive(Debug)] enum Fruit { Apple, Orange, Banana, Kiwi, Lemon } fn main() { let apple = Some(Fruit::Apple); let orange = Some(Fruit::Orange); let no_fruit: Option<Fruit> = None; let first_available_fruit = no_fruit.or(orange).or(apple); println!("first_available_fruit: {:?}", first_available_fruit); // first_available_fruit: Some(Orange) // `or` moves its argument. // In the example above, `or(orange)` returned a `Some`, so `or(apple)` was not invoked. // But the variable named `apple` has been moved regardless, and cannot be used anymore. // println!("Variable apple was moved, so this line won't compile: {:?}", apple); // TODO: uncomment the line above to see the compiler error }
or_else()
is chainable, evaluates lazily, keeps empty value intact
Another alternative is to use or_else
, which is also chainable, and evaluates lazily, as is shown in the following example:
#[derive(Debug)] enum Fruit { Apple, Orange, Banana, Kiwi, Lemon } fn main() { let apple = Some(Fruit::Apple); let no_fruit: Option<Fruit> = None; let get_kiwi_as_fallback = || { println!("Providing kiwi as fallback"); Some(Fruit::Kiwi) }; let get_lemon_as_fallback = || { println!("Providing lemon as fallback"); Some(Fruit::Lemon) }; let first_available_fruit = no_fruit .or_else(get_kiwi_as_fallback) .or_else(get_lemon_as_fallback); println!("first_available_fruit: {:?}", first_available_fruit); // Providing kiwi as fallback // first_available_fruit: Some(Kiwi) }
get_or_insert()
evaluates eagerly, modifies empty value in place
To make sure that an Option
contains a value, we can use get_or_insert
to modify it in place with a fallback value, as is shown in the following example. Note that get_or_insert
eagerly evaluates its parameter, so variable apple
is moved:
#[derive(Debug)] enum Fruit { Apple, Orange, Banana, Kiwi, Lemon } fn main() { let mut my_fruit: Option<Fruit> = None; let apple = Fruit::Apple; let first_available_fruit = my_fruit.get_or_insert(apple); println!("first_available_fruit is: {:?}", first_available_fruit); println!("my_fruit is: {:?}", my_fruit); // first_available_fruit is: Apple // my_fruit is: Some(Apple) //println!("Variable named `apple` is moved: {:?}", apple); // TODO: uncomment the line above to see the compiler error }
get_or_insert_with()
evaluates lazily, modifies empty value in place
Instead of explicitly providing a value to fall back on, we can pass a closure to get_or_insert_with
, as follows:
#[derive(Debug)] enum Fruit { Apple, Orange, Banana, Kiwi, Lemon } fn main() { let mut my_fruit: Option<Fruit> = None; let get_lemon_as_fallback = || { println!("Providing lemon as fallback"); Fruit::Lemon }; let first_available_fruit = my_fruit .get_or_insert_with(get_lemon_as_fallback); println!("first_available_fruit is: {:?}", first_available_fruit); println!("my_fruit is: {:?}", my_fruit); // Providing lemon as fallback // first_available_fruit is: Lemon // my_fruit is: Some(Lemon) // If the Option has a value, it is left unchanged, and the closure is not invoked let mut my_apple = Some(Fruit::Apple); let should_be_apple = my_apple.get_or_insert_with(get_lemon_as_fallback); println!("should_be_apple is: {:?}", should_be_apple); println!("my_apple is unchanged: {:?}", my_apple); // The output is a follows. Note that the closure `get_lemon_as_fallback` is not invoked // should_be_apple is: Apple // my_apple is unchanged: Some(Apple) }
See also:
closures
, get_or_insert
, get_or_insert_with
, ,moved variables
, or
, or_else
Result
Result
is a richer version of the Option
type that describes possible error instead of possible absence.
That is, Result<T, E>
could have one of two outcomes:
Ok(T)
: An elementT
was foundErr(E)
: An error was found with elementE
By convention, the expected outcome is Ok
while the unexpected outcome is Err
.
Like Option
, Result
has many methods associated with it. unwrap()
, for example, either yields the element T
or panic
s. For case handling, there are many combinators between Result
and Option
that overlap.
In working with Rust, you will likely encounter methods that return the Result
type, such as the parse()
method. It might not always be possible to parse a string into the other type, so parse()
returns a Result
indicating possible failure.
Let's see what happens when we successfully and unsuccessfully parse()
a string:
fn multiply(first_number_str: &str, second_number_str: &str) -> i32 { // Let's try using `unwrap()` to get the number out. Will it bite us? let first_number = first_number_str.parse::<i32>().unwrap(); let second_number = second_number_str.parse::<i32>().unwrap(); first_number * second_number } fn main() { let twenty = multiply("10", "2"); println!("double is {}", twenty); let tt = multiply("t", "2"); println!("double is {}", tt); }
In the unsuccessful case, parse()
leaves us with an error for unwrap()
to panic
on. Additionally, the panic
exits our program and provides an unpleasant error message.
To improve the quality of our error message, we should be more specific about the return type and consider explicitly handling the error.
Using Result
in main
The Result
type can also be the return type of the main
function if specified explicitly. Typically the main
function will be of the form:
fn main() { println!("Hello World!"); }
However main
is also able to have a return type of Result
. If an error occurs within the main
function it will return an error code and print a debug representation of the error (using the Debug
trait). The following example shows such a scenario and touches on aspects covered in the following section.
use std::num::ParseIntError; fn main() -> Result<(), ParseIntError> { let number_str = "10"; let number = match number_str.parse::<i32>() { Ok(number) => number, Err(e) => return Err(e), }; println!("{}", number); Ok(()) }
map
for Result
Panicking in the previous example's multiply
does not make for robust code. Generally, we want to return the error to the caller so it can decide what is the right way to respond to errors.
We first need to know what kind of error type we are dealing with. To determine the Err
type, we look to parse()
, which is implemented with the FromStr
trait for i32
. As a result, the Err
type is specified as ParseIntError
.
In the example below, the straightforward match
statement leads to code that is overall more cumbersome.
use std::num::ParseIntError; // With the return type rewritten, we use pattern matching without `unwrap()`. fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { match first_number_str.parse::<i32>() { Ok(first_number) => { match second_number_str.parse::<i32>() { Ok(second_number) => { Ok(first_number * second_number) }, Err(e) => Err(e), } }, Err(e) => Err(e), } } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { // This still presents a reasonable answer. let twenty = multiply("10", "2"); print(twenty); // The following now provides a much more helpful error message. let tt = multiply("t", "2"); print(tt); }
Luckily, Option
's map
, and_then
, and many other combinators are also implemented for Result
. Result
contains a complete listing.
use std::num::ParseIntError; // As with `Option`, we can use combinators such as `map()`. // This function is otherwise identical to the one above and reads: // Multiply if both values can be parsed from str, otherwise pass on the error. fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { first_number_str.parse::<i32>().and_then(|first_number| { second_number_str.parse::<i32>().map(|second_number| first_number * second_number) }) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { // This still presents a reasonable answer. let twenty = multiply("10", "2"); print(twenty); // The following now provides a much more helpful error message. let tt = multiply("t", "2"); print(tt); }
aliases for Result
How about when we want to reuse a specific Result
type many times? Recall that Rust allows us to create aliases. Conveniently, we can define one for the specific Result
in question.
At a module level, creating aliases can be particularly helpful. Errors found in a specific module often have the same Err
type, so a single alias can succinctly define all associated Results
. This is so useful that the std
library even supplies one: io::Result
!
Here's a quick example to show off the syntax:
use std::num::ParseIntError; // Define a generic alias for a `Result` with the error type `ParseIntError`. type AliasedResult<T> = Result<T, ParseIntError>; // Use the above alias to refer to our specific `Result` type. fn multiply(first_number_str: &str, second_number_str: &str) -> AliasedResult<i32> { first_number_str.parse::<i32>().and_then(|first_number| { second_number_str.parse::<i32>().map(|second_number| first_number * second_number) }) } // Here, the alias again allows us to save some space. fn print(result: AliasedResult<i32>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
See also:
Early returns
In the previous example, we explicitly handled the errors using combinators. Another way to deal with this case analysis is to use a combination of match
statements and early returns.
That is, we can simply stop executing the function and return the error if one occurs. For some, this form of code can be easier to both read and write. Consider this version of the previous example, rewritten using early returns:
use std::num::ParseIntError; fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { let first_number = match first_number_str.parse::<i32>() { Ok(first_number) => first_number, Err(e) => return Err(e), }; let second_number = match second_number_str.parse::<i32>() { Ok(second_number) => second_number, Err(e) => return Err(e), }; Ok(first_number * second_number) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
At this point, we've learned to explicitly handle errors using combinators and early returns. While we generally want to avoid panicking, explicitly handling all of our errors is cumbersome.
In the next section, we'll introduce ?
for the cases where we simply need to unwrap
without possibly inducing panic
.
Introducing ?
Sometimes we just want the simplicity of unwrap
without the possibility of a panic
. Until now, unwrap
has forced us to nest deeper and deeper when what we really wanted was to get the variable out. This is exactly the purpose of ?
.
Upon finding an Err
, there are two valid actions to take:
panic!
which we already decided to try to avoid if possiblereturn
because anErr
means it cannot be handled
?
is almost1 exactly equivalent to an unwrap
which return
s instead of panic
king on Err
s. Let's see how we can simplify the earlier example that used combinators:
use std::num::ParseIntError; fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { let first_number = first_number_str.parse::<i32>()?; let second_number = second_number_str.parse::<i32>()?; Ok(first_number * second_number) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
The try!
macro
Before there was ?
, the same functionality was achieved with the try!
macro. The ?
operator is now recommended, but you may still find try!
when looking at older code. The same multiply
function from the previous example would look like this using try!
:
// To compile and run this example without errors, while using Cargo, change the value // of the `edition` field, in the `[package]` section of the `Cargo.toml` file, to "2015". use std::num::ParseIntError; fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> { let first_number = try!(first_number_str.parse::<i32>()); let second_number = try!(second_number_str.parse::<i32>()); Ok(first_number * second_number) } fn print(result: Result<i32, ParseIntError>) { match result { Ok(n) => println!("n is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { print(multiply("10", "2")); print(multiply("t", "2")); }
See re-enter ? for more details.
Multiple error types
The previous examples have always been very convenient; Result
s interact with other Result
s and Option
s interact with other Option
s.
Sometimes an Option
needs to interact with a Result
, or a Result<T, Error1>
needs to interact with a Result<T, Error2>
. In those cases, we want to manage our different error types in a way that makes them composable and easy to interact with.
In the following code, two instances of unwrap
generate different error types. Vec::first
returns an Option
, while parse::<i32>
returns a Result<i32, ParseIntError>
:
fn double_first(vec: Vec<&str>) -> i32 { let first = vec.first().unwrap(); // Generate error 1 2 * first.parse::<i32>().unwrap() // Generate error 2 } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; println!("The first doubled is {}", double_first(numbers)); println!("The first doubled is {}", double_first(empty)); // Error 1: the input vector is empty println!("The first doubled is {}", double_first(strings)); // Error 2: the element doesn't parse to a number }
Over the next sections, we'll see several strategies for handling these kind of problems.
Pulling Result
s out of Option
s
The most basic way of handling mixed error types is to just embed them in each other.
use std::num::ParseIntError; fn double_first(vec: Vec<&str>) -> Option<Result<i32, ParseIntError>> { vec.first().map(|first| { first.parse::<i32>().map(|n| 2 * n) }) } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; println!("The first doubled is {:?}", double_first(numbers)); println!("The first doubled is {:?}", double_first(empty)); // Error 1: the input vector is empty println!("The first doubled is {:?}", double_first(strings)); // Error 2: the element doesn't parse to a number }
There are times when we'll want to stop processing on errors (like with ?
) but keep going when the Option
is None
. A couple of combinators come in handy to swap the Result
and Option
.
use std::num::ParseIntError; fn double_first(vec: Vec<&str>) -> Result<Option<i32>, ParseIntError> { let opt = vec.first().map(|first| { first.parse::<i32>().map(|n| 2 * n) }); opt.map_or(Ok(None), |r| r.map(Some)) } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; println!("The first doubled is {:?}", double_first(numbers)); println!("The first doubled is {:?}", double_first(empty)); println!("The first doubled is {:?}", double_first(strings)); }
Defining an error type
Sometimes it simplifies the code to mask all of the different errors with a single type of error. We'll show this with a custom error.
Rust allows us to define our own error types. In general, a "good" error type:
- Represents different errors with the same type
- Presents nice error messages to the user
- Is easy to compare with other types
- Good:
Err(EmptyVec)
- Bad:
Err("Please use a vector with at least one element".to_owned())
- Good:
- Can hold information about the error
- Good:
Err(BadChar(c, position))
- Bad:
Err("+ cannot be used here".to_owned())
- Good:
- Composes well with other errors
use std::fmt; type Result<T> = std::result::Result<T, DoubleError>; // Define our error types. These may be customized for our error handling cases. // Now we will be able to write our own errors, defer to an underlying error // implementation, or do something in between. #[derive(Debug, Clone)] struct DoubleError; // Generation of an error is completely separate from how it is displayed. // There's no need to be concerned about cluttering complex logic with the display style. // // Note that we don't store any extra info about the errors. This means we can't state // which string failed to parse without modifying our types to carry that information. impl fmt::Display for DoubleError { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "invalid first item to double") } } fn double_first(vec: Vec<&str>) -> Result<i32> { vec.first() // Change the error to our new type. .ok_or(DoubleError) .and_then(|s| { s.parse::<i32>() // Update to the new error type here also. .map_err(|_| DoubleError) .map(|i| 2 * i) }) } fn print(result: Result<i32>) { match result { Ok(n) => println!("The first doubled is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; print(double_first(numbers)); print(double_first(empty)); print(double_first(strings)); }
Box
ing errors
A way to write simple code while preserving the original errors is to Box
them. The drawback is that the underlying error type is only known at runtime and not statically determined.
The stdlib helps in boxing our errors by having Box
implement conversion from any type that implements the Error
trait into the trait object Box<Error>
, via From
.
use std::error; use std::fmt; // Change the alias to `Box<error::Error>`. type Result<T> = std::result::Result<T, Box<dyn error::Error>>; #[derive(Debug, Clone)] struct EmptyVec; impl fmt::Display for EmptyVec { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "invalid first item to double") } } impl error::Error for EmptyVec {} fn double_first(vec: Vec<&str>) -> Result<i32> { vec.first() .ok_or_else(|| EmptyVec.into()) // Converts to Box .and_then(|s| { s.parse::<i32>() .map_err(|e| e.into()) // Converts to Box .map(|i| 2 * i) }) } fn print(result: Result<i32>) { match result { Ok(n) => println!("The first doubled is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; print(double_first(numbers)); print(double_first(empty)); print(double_first(strings)); }
See also:
Dynamic dispatch and Error
trait
Other uses of ?
Notice in the previous example that our immediate reaction to calling parse
is to map
the error from a library error into a boxed error:
.and_then(|s| s.parse::<i32>())
.map_err(|e| e.into())
Since this is a simple and common operation, it would be convenient if it could be elided. Alas, because and_then
is not sufficiently flexible, it cannot. However, we can instead use ?
.
?
was previously explained as either unwrap
or return Err(err)
. This is only mostly true. It actually means unwrap
or return Err(From::from(err))
. Since From::from
is a conversion utility between different types, this means that if you ?
where the error is convertible to the return type, it will convert automatically.
Here, we rewrite the previous example using ?
. As a result, the map_err
will go away when From::from
is implemented for our error type:
use std::error; use std::fmt; // Change the alias to `Box<dyn error::Error>`. type Result<T> = std::result::Result<T, Box<dyn error::Error>>; #[derive(Debug)] struct EmptyVec; impl fmt::Display for EmptyVec { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "invalid first item to double") } } impl error::Error for EmptyVec {} // The same structure as before but rather than chain all `Results` // and `Options` along, we `?` to get the inner value out immediately. fn double_first(vec: Vec<&str>) -> Result<i32> { let first = vec.first().ok_or(EmptyVec)?; let parsed = first.parse::<i32>()?; Ok(2 * parsed) } fn print(result: Result<i32>) { match result { Ok(n) => println!("The first doubled is {}", n), Err(e) => println!("Error: {}", e), } } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; print(double_first(numbers)); print(double_first(empty)); print(double_first(strings)); }
This is actually fairly clean now. Compared with the original panic
, it is very similar to replacing the unwrap
calls with ?
except that the return types are Result
. As a result, they must be destructured at the top level.
See also:
From::from
and ?
Wrapping errors
An alternative to boxing errors is to wrap them in your own error type.
use std::error; use std::error::Error; use std::num::ParseIntError; use std::fmt; type Result<T> = std::result::Result<T, DoubleError>; #[derive(Debug)] enum DoubleError { EmptyVec, // We will defer to the parse error implementation for their error. // Supplying extra info requires adding more data to the type. Parse(ParseIntError), } impl fmt::Display for DoubleError { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { match *self { DoubleError::EmptyVec => write!(f, "please use a vector with at least one element"), // The wrapped error contains additional information and is available // via the source() method. DoubleError::Parse(..) => write!(f, "the provided string could not be parsed as int"), } } } impl error::Error for DoubleError { fn source(&self) -> Option<&(dyn error::Error + 'static)> { match *self { DoubleError::EmptyVec => None, // The cause is the underlying implementation error type. Is implicitly // cast to the trait object `&error::Error`. This works because the // underlying type already implements the `Error` trait. DoubleError::Parse(ref e) => Some(e), } } } // Implement the conversion from `ParseIntError` to `DoubleError`. // This will be automatically called by `?` if a `ParseIntError` // needs to be converted into a `DoubleError`. impl From<ParseIntError> for DoubleError { fn from(err: ParseIntError) -> DoubleError { DoubleError::Parse(err) } } fn double_first(vec: Vec<&str>) -> Result<i32> { let first = vec.first().ok_or(DoubleError::EmptyVec)?; // Here we implicitly use the `ParseIntError` implementation of `From` (which // we defined above) in order to create a `DoubleError`. let parsed = first.parse::<i32>()?; Ok(2 * parsed) } fn print(result: Result<i32>) { match result { Ok(n) => println!("The first doubled is {}", n), Err(e) => { println!("Error: {}", e); if let Some(source) = e.source() { println!(" Caused by: {}", source); } }, } } fn main() { let numbers = vec!["42", "93", "18"]; let empty = vec![]; let strings = vec!["tofu", "93", "18"]; print(double_first(numbers)); print(double_first(empty)); print(double_first(strings)); }
This adds a bit more boilerplate for handling errors and might not be needed in all applications. There are some libraries that can take care of the boilerplate for you.
See also:
From::from
and Enums
Iterating over Result
s
An Iter::map
operation might fail, for example:
fn main() { let strings = vec!["tofu", "93", "18"]; let numbers: Vec<_> = strings .into_iter() .map(|s| s.parse::<i32>()) .collect(); println!("Results: {:?}", numbers); }
Let's step through strategies for handling this.
Ignore the failed items with filter_map()
filter_map
calls a function and filters out the results that are None
.
fn main() { let strings = vec!["tofu", "93", "18"]; let numbers: Vec<_> = strings .into_iter() .filter_map(|s| s.parse::<i32>().ok()) .collect(); println!("Results: {:?}", numbers); }
Collect the failed items with map_err()
and filter_map()
map_err
calls a function with the error, so by adding that to the previous filter_map
solution we can save them off to the side while iterating.
fn main() { let strings = vec!["42", "tofu", "93", "999", "18"]; let mut errors = vec![]; let numbers: Vec<_> = strings .into_iter() .map(|s| s.parse::<u8>()) .filter_map(|r| r.map_err(|e| errors.push(e)).ok()) .collect(); println!("Numbers: {:?}", numbers); println!("Errors: {:?}", errors); }
Fail the entire operation with collect()
Result
implements FromIterator
so that a vector of results (Vec<Result<T, E>>
) can be turned into a result with a vector (Result<Vec<T>, E>
). Once an Result::Err
is found, the iteration will terminate.
fn main() { let strings = vec!["tofu", "93", "18"]; let numbers: Result<Vec<_>, _> = strings .into_iter() .map(|s| s.parse::<i32>()) .collect(); println!("Results: {:?}", numbers); }
This same technique can be used with Option
.
Collect all valid values and failures with partition()
fn main() { let strings = vec!["tofu", "93", "18"]; let (numbers, errors): (Vec<_>, Vec<_>) = strings .into_iter() .map(|s| s.parse::<i32>()) .partition(Result::is_ok); println!("Numbers: {:?}", numbers); println!("Errors: {:?}", errors); }
When you look at the results, you'll note that everything is still wrapped in Result
. A little more boilerplate is needed for this.
fn main() { let strings = vec!["tofu", "93", "18"]; let (numbers, errors): (Vec<_>, Vec<_>) = strings .into_iter() .map(|s| s.parse::<i32>()) .partition(Result::is_ok); let numbers: Vec<_> = numbers.into_iter().map(Result::unwrap).collect(); let errors: Vec<_> = errors.into_iter().map(Result::unwrap_err).collect(); println!("Numbers: {:?}", numbers); println!("Errors: {:?}", errors); }
Std library types
The std
library provides many custom types which expands drastically on the primitives
. Some of these include:
- growable
String
s like:"hello world"
- growable vectors:
[1, 2, 3]
- optional types:
Option<i32>
- error handling types:
Result<i32, i32>
- heap allocated pointers:
Box<i32>
See also:
primitives and the std library
Box, stack and heap
All values in Rust are stack allocated by default. Values can be boxed (allocated on the heap) by creating a Box<T>
. A box is a smart pointer to a heap allocated value of type T
. When a box goes out of scope, its destructor is called, the inner object is destroyed, and the memory on the heap is freed.
Boxed values can be dereferenced using the *
operator; this removes one layer of indirection.
use std::mem; #[allow(dead_code)] #[derive(Debug, Clone, Copy)] struct Point { x: f64, y: f64, } // A Rectangle can be specified by where its top left and bottom right // corners are in space #[allow(dead_code)] struct Rectangle { top_left: Point, bottom_right: Point, } fn origin() -> Point { Point { x: 0.0, y: 0.0 } } fn boxed_origin() -> Box<Point> { // Allocate this point on the heap, and return a pointer to it Box::new(Point { x: 0.0, y: 0.0 }) } fn main() { // (all the type annotations are superfluous) // Stack allocated variables let point: Point = origin(); let rectangle: Rectangle = Rectangle { top_left: origin(), bottom_right: Point { x: 3.0, y: -4.0 } }; // Heap allocated rectangle let boxed_rectangle: Box<Rectangle> = Box::new(Rectangle { top_left: origin(), bottom_right: Point { x: 3.0, y: -4.0 }, }); // The output of functions can be boxed let boxed_point: Box<Point> = Box::new(origin()); // Double indirection let box_in_a_box: Box<Box<Point>> = Box::new(boxed_origin()); println!("Point occupies {} bytes on the stack", mem::size_of_val(&point)); println!("Rectangle occupies {} bytes on the stack", mem::size_of_val(&rectangle)); // box size == pointer size println!("Boxed point occupies {} bytes on the stack", mem::size_of_val(&boxed_point)); println!("Boxed rectangle occupies {} bytes on the stack", mem::size_of_val(&boxed_rectangle)); println!("Boxed box occupies {} bytes on the stack", mem::size_of_val(&box_in_a_box)); // Copy the data contained in `boxed_point` into `unboxed_point` let unboxed_point: Point = *boxed_point; println!("Unboxed point occupies {} bytes on the stack", mem::size_of_val(&unboxed_point)); }
Vectors
Vectors are re-sizable arrays. Like slices, their size is not known at compile time, but they can grow or shrink at any time. A vector is represented using 3 parameters:
- pointer to the data
- length
- capacity
The capacity indicates how much memory is reserved for the vector. The vector can grow as long as the length is smaller than the capacity. When this threshold needs to be surpassed, the vector is reallocated with a larger capacity:
fn main() { // Iterators can be collected into vectors let collected_iterator: Vec<i32> = (0..10).collect(); println!("Collected (0..10) into: {:?}", collected_iterator); // The `vec!` macro can be used to initialize a vector let mut xs = vec![1i32, 2, 3]; println!("Initial vector: {:?}", xs); // Insert new element at the end of the vector println!("Push 4 into the vector"); xs.push(4); println!("Vector: {:?}", xs); // Error! Immutable vectors can't grow collected_iterator.push(0); // FIXME ^ Comment out this line // The `len` method yields the number of elements currently stored in a vector println!("Vector length: {}", xs.len()); // Indexing is done using the square brackets (indexing starts at 0) println!("Second element: {}", xs[1]); // `pop` removes the last element from the vector and returns it println!("Pop last element: {:?}", xs.pop()); // Out of bounds indexing yields a panic println!("Fourth element: {}", xs[3]); // FIXME ^ Comment out this line // `Vector`s can be easily iterated over println!("Contents of xs:"); for x in xs.iter() { println!("> {}", x); } // A `Vector` can also be iterated over while the iteration // count is enumerated in a separate variable (`i`) for (i, x) in xs.iter().enumerate() { println!("In position {} we have value {}", i, x); } // Thanks to `iter_mut`, mutable `Vector`s can also be iterated // over in a way that allows modifying each value for x in xs.iter_mut() { *x *= 3; } println!("Updated vector: {:?}", xs); }
More Vec
methods can be found under the std::vec module
Strings
There are two types of strings in Rust: String
and &str
.
A String
is stored as a vector of bytes (Vec<u8>
), but guaranteed to always be a valid UTF-8 sequence. String
is heap allocated, growable and not null terminated.
&str
is a slice (&[u8]
) that always points to a valid UTF-8 sequence, and can be used to view into a String
, just like &[T]
is a view into Vec<T>
:
fn main() { // (all the type annotations are superfluous) // A reference to a string allocated in read only memory let pangram: &'static str = "the quick brown fox jumps over the lazy dog"; println!("Pangram: {}", pangram); // Iterate over words in reverse, no new string is allocated println!("Words in reverse"); for word in pangram.split_whitespace().rev() { println!("> {}", word); } // Copy chars into a vector, sort and remove duplicates let mut chars: Vec<char> = pangram.chars().collect(); chars.sort(); chars.dedup(); // Create an empty and growable `String` let mut string = String::new(); for c in chars { // Insert a char at the end of string string.push(c); // Insert a string at the end of string string.push_str(", "); } // The trimmed string is a slice to the original string, hence no new // allocation is performed let chars_to_trim: &[char] = &[' ', ',']; let trimmed_str: &str = string.trim_matches(chars_to_trim); println!("Used characters: {}", trimmed_str); // Heap allocate a string let alice = String::from("I like dogs"); // Allocate new memory and store the modified string there let bob: String = alice.replace("dog", "cat"); println!("Alice says: {}", alice); println!("Bob says: {}", bob); }
More str
/String
methods can be found under the std::str and std::string modules
Literals and escapes
There are multiple ways to write string literals with special characters in them. All result in a similar &str
so it's best to use the form that is the most convenient to write. Similarly there are multiple ways to write byte string literals, which all result in &[u8; N]
.
Generally special characters are escaped with a backslash character: \
. This way you can add any character to your string, even unprintable ones and ones that you don't know how to type. If you want a literal backslash, escape it with another one: \\
String or character literal delimiters occurring within a literal must be escaped: "\""
, '\''
.
fn main() { // You can use escapes to write bytes by their hexadecimal values... let byte_escape = "I'm writing \x52\x75\x73\x74!"; println!("What are you doing\x3F (\\x3F means ?) {}", byte_escape); // ...or Unicode code points. let unicode_codepoint = "\u{211D}"; let character_name = "\"DOUBLE-STRUCK CAPITAL R\""; println!("Unicode character {} (U+211D) is called {}", unicode_codepoint, character_name ); let long_string = "String literals can span multiple lines. The linebreak and indentation here ->\ <- can be escaped too!"; println!("{}", long_string); }
Sometimes there are just too many characters that need to be escaped or it's just much more convenient to write a string out as-is. This is where raw string literals come into play.
fn main() { let raw_str = r"Escapes don't work here: \x3F \u{211D}"; println!("{}", raw_str); // If you need quotes in a raw string, add a pair of #s let quotes = r#"And then I said: "There is no escape!""#; println!("{}", quotes); // If you need "# in your string, just use more #s in the delimiter. // You can use up to 65535 #s. let longer_delimiter = r###"A string with "# in it. And even "##!"###; println!("{}", longer_delimiter); }
Want a string that's not UTF-8? (Remember, str
and String
must be valid UTF-8). Or maybe you want an array of bytes that's mostly text? Byte strings to the rescue!
use std::str; fn main() { // Note that this is not actually a `&str` let bytestring: &[u8; 21] = b"this is a byte string"; // Byte arrays don't have the `Display` trait, so printing them is a bit limited println!("A byte string: {:?}", bytestring); // Byte strings can have byte escapes... let escaped = b"\x52\x75\x73\x74 as bytes"; // ...but no unicode escapes // let escaped = b"\u{211D} is not allowed"; println!("Some escaped bytes: {:?}", escaped); // Raw byte strings work just like raw strings let raw_bytestring = br"\u{211D} is not escaped here"; println!("{:?}", raw_bytestring); // Converting a byte array to `str` can fail if let Ok(my_str) = str::from_utf8(raw_bytestring) { println!("And the same as text: '{}'", my_str); } let _quotes = br#"You can also use "fancier" formatting, \ like with normal raw strings"#; // Byte strings don't have to be UTF-8 let shift_jis = b"\x82\xe6\x82\xa8\x82\xb1\x82\xbb"; // "ようこそ" in SHIFT-JIS // But then they can't always be converted to `str` match str::from_utf8(shift_jis) { Ok(my_str) => println!("Conversion successful: '{}'", my_str), Err(e) => println!("Conversion failed: {:?}", e), }; }
For conversions between character encodings check out the encoding crate.
A more detailed listing of the ways to write string literals and escape characters is given in the 'Tokens' chapter of the Rust Reference.
Option
Sometimes it's desirable to catch the failure of some parts of a program instead of calling panic!
; this can be accomplished using the Option
enum.
The Option<T>
enum has two variants:
None
, to indicate failure or lack of value, andSome(value)
, a tuple struct that wraps avalue
with typeT
.
// An integer division that doesn't `panic!` fn checked_division(dividend: i32, divisor: i32) -> Option<i32> { if divisor == 0 { // Failure is represented as the `None` variant None } else { // Result is wrapped in a `Some` variant Some(dividend / divisor) } } // This function handles a division that may not succeed fn try_division(dividend: i32, divisor: i32) { // `Option` values can be pattern matched, just like other enums match checked_division(dividend, divisor) { None => println!("{} / {} failed!", dividend, divisor), Some(quotient) => { println!("{} / {} = {}", dividend, divisor, quotient) }, } } fn main() { try_division(4, 2); try_division(1, 0); // Binding `None` to a variable needs to be type annotated let none: Option<i32> = None; let _equivalent_none = None::<i32>; let optional_float = Some(0f32); // Unwrapping a `Some` variant will extract the value wrapped. println!("{:?} unwraps to {:?}", optional_float, optional_float.unwrap()); // Unwrapping a `None` variant will `panic!` println!("{:?} unwraps to {:?}", none, none.unwrap()); }
Result
We've seen that the Option
enum can be used as a return value from functions that may fail, where None
can be returned to indicate failure. However, sometimes it is important to express why an operation failed. To do this we have the Result
enum.
The Result<T, E>
enum has two variants:
Ok(value)
which indicates that the operation succeeded, and wraps thevalue
returned by the operation. (value
has typeT
)Err(why)
, which indicates that the operation failed, and wrapswhy
, which (hopefully) explains the cause of the failure. (why
has typeE
)
mod checked { // Mathematical "errors" we want to catch #[derive(Debug)] pub enum MathError { DivisionByZero, NonPositiveLogarithm, NegativeSquareRoot, } pub type MathResult = Result<f64, MathError>; pub fn div(x: f64, y: f64) -> MathResult { if y == 0.0 { // This operation would `fail`, instead let's return the reason of // the failure wrapped in `Err` Err(MathError::DivisionByZero) } else { // This operation is valid, return the result wrapped in `Ok` Ok(x / y) } } pub fn sqrt(x: f64) -> MathResult { if x < 0.0 { Err(MathError::NegativeSquareRoot) } else { Ok(x.sqrt()) } } pub fn ln(x: f64) -> MathResult { if x <= 0.0 { Err(MathError::NonPositiveLogarithm) } else { Ok(x.ln()) } } } // `op(x, y)` === `sqrt(ln(x / y))` fn op(x: f64, y: f64) -> f64 { // This is a three level match pyramid! match checked::div(x, y) { Err(why) => panic!("{:?}", why), Ok(ratio) => match checked::ln(ratio) { Err(why) => panic!("{:?}", why), Ok(ln) => match checked::sqrt(ln) { Err(why) => panic!("{:?}", why), Ok(sqrt) => sqrt, }, }, } } fn main() { // Will this fail? println!("{}", op(1.0, 10.0)); }
?
Chaining results using match can get pretty untidy; luckily, the ?
operator can be used to make things pretty again. ?
is used at the end of an expression returning a Result
, and is equivalent to a match expression, where the Err(err)
branch expands to an early return Err(From::from(err))
, and the Ok(ok)
branch expands to an ok
expression:
mod checked { #[derive(Debug)] enum MathError { DivisionByZero, NonPositiveLogarithm, NegativeSquareRoot, } type MathResult = Result<f64, MathError>; fn div(x: f64, y: f64) -> MathResult { if y == 0.0 { Err(MathError::DivisionByZero) } else { Ok(x / y) } } fn sqrt(x: f64) -> MathResult { if x < 0.0 { Err(MathError::NegativeSquareRoot) } else { Ok(x.sqrt()) } } fn ln(x: f64) -> MathResult { if x <= 0.0 { Err(MathError::NonPositiveLogarithm) } else { Ok(x.ln()) } } // Intermediate function fn op_(x: f64, y: f64) -> MathResult { // if `div` "fails", then `DivisionByZero` will be `return`ed let ratio = div(x, y)?; // if `ln` "fails", then `NonPositiveLogarithm` will be `return`ed let ln = ln(ratio)?; sqrt(ln) } pub fn op(x: f64, y: f64) { match op_(x, y) { Err(why) => panic!("{}", match why { MathError::NonPositiveLogarithm => "logarithm of non-positive number", MathError::DivisionByZero => "division by zero", MathError::NegativeSquareRoot => "square root of negative number", }), Ok(value) => println!("{}", value), } } } fn main() { checked::op(1.0, 10.0); }
Be sure to check the documentation, as there are many methods to map/compose Result
.
panic!
The panic!
macro can be used to generate a panic and start unwinding its stack. While unwinding, the runtime will take care of freeing all the resources owned by the thread by calling the destructor of all its objects.
Since we are dealing with programs with only one thread, panic!
will cause the program to report the panic message and exit.
// Re-implementation of integer division (/) fn division(dividend: i32, divisor: i32) -> i32 { if divisor == 0 { // Division by zero triggers a panic panic!("division by zero"); } else { dividend / divisor } } // The `main` task fn main() { // Heap allocated integer let _x = Box::new(0i32); // This operation will trigger a task failure division(3, 0); println!("This point won't be reached!"); // `_x` should get destroyed at this point }
Let's check that panic!
doesn't leak memory.
$ rustc panic.rs && valgrind ./panic
==4401== Memcheck, a memory error detector
==4401== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4401== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==4401== Command: ./panic
==4401==
thread '<main>' panicked at 'division by zero', panic.rs:5
==4401==
==4401== HEAP SUMMARY:
==4401== in use at exit: 0 bytes in 0 blocks
==4401== total heap usage: 18 allocs, 18 frees, 1,648 bytes allocated
==4401==
==4401== All heap blocks were freed -- no leaks are possible
==4401==
==4401== For counts of detected and suppressed errors, rerun with: -v
==4401== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
HashMap
Where vectors store values by an integer index, HashMap
s store values by key. HashMap
keys can be booleans, integers, strings, or any other type that implements the Eq
and Hash
traits. More on this in the next section.
Like vectors, HashMap
s are growable, but HashMaps can also shrink themselves when they have excess space. You can create a HashMap with a certain starting capacity using HashMap::with_capacity(uint)
, or use HashMap::new()
to get a HashMap with a default initial capacity (recommended):
use std::collections::HashMap; fn call(number: &str) -> &str { match number { "798-1364" => "We're sorry, the call cannot be completed as dialed. Please hang up and try again.", "645-7689" => "Hello, this is Mr. Awesome's Pizza. My name is Fred. What can I get for you today?", _ => "Hi! Who is this again?" } } fn main() { let mut contacts = HashMap::new(); contacts.insert("Daniel", "798-1364"); contacts.insert("Ashley", "645-7689"); contacts.insert("Katie", "435-8291"); contacts.insert("Robert", "956-1745"); // Takes a reference and returns Option<&V> match contacts.get(&"Daniel") { Some(&number) => println!("Calling Daniel: {}", call(number)), _ => println!("Don't have Daniel's number."), } // `HashMap::insert()` returns `None` // if the inserted value is new, `Some(value)` otherwise contacts.insert("Daniel", "164-6743"); match contacts.get(&"Ashley") { Some(&number) => println!("Calling Ashley: {}", call(number)), _ => println!("Don't have Ashley's number."), } contacts.remove(&"Ashley"); // `HashMap::iter()` returns an iterator that yields // (&'a key, &'a value) pairs in arbitrary order. for (contact, &number) in contacts.iter() { println!("Calling {}: {}", contact, call(number)); } }
For more information on how hashing and hash maps (sometimes called hash tables) work, have a look at Hash Table Wikipedia
Alternate/custom key types
Any type that implements the Eq
and Hash
traits can be a key in HashMap
. This includes:
bool
(though not very useful since there is only two possible keys)int
,uint
, and all variations thereofString
and&str
(protip: you can have aHashMap
keyed byString
and call.get()
with an&str
)
Note that f32
and f64
do not implement Hash
, likely because floating-point precision errors would make using them as hashmap keys horribly error-prone.
All collection classes implement Eq
and Hash
if their contained type also respectively implements Eq
and Hash
. For example, Vec<T>
will implement Hash
if T
implements Hash
.
You can easily implement Eq
and Hash
for a custom type with just one line: #[derive(PartialEq, Eq, Hash)]
The compiler will do the rest. If you want more control over the details, you can implement Eq
and/or Hash
yourself. This guide will not cover the specifics of implementing Hash
.
To play around with using a struct
in HashMap
, let's try making a very simple user logon system:
use std::collections::HashMap; // Eq requires that you derive PartialEq on the type. #[derive(PartialEq, Eq, Hash)] struct Account<'a>{ username: &'a str, password: &'a str, } struct AccountInfo<'a>{ name: &'a str, email: &'a str, } type Accounts<'a> = HashMap<Account<'a>, AccountInfo<'a>>; fn try_logon<'a>(accounts: &Accounts<'a>, username: &'a str, password: &'a str){ println!("Username: {}", username); println!("Password: {}", password); println!("Attempting logon..."); let logon = Account { username, password, }; match accounts.get(&logon) { Some(account_info) => { println!("Successful logon!"); println!("Name: {}", account_info.name); println!("Email: {}", account_info.email); }, _ => println!("Login failed!"), } } fn main(){ let mut accounts: Accounts = HashMap::new(); let account = Account { username: "j.everyman", password: "password123", }; let account_info = AccountInfo { name: "John Everyman", email: "j.everyman@email.com", }; accounts.insert(account, account_info); try_logon(&accounts, "j.everyman", "psasword123"); try_logon(&accounts, "j.everyman", "password123"); }
HashSet
Consider a HashSet
as a HashMap
where we just care about the keys (HashSet<T>
is, in actuality, just a wrapper around HashMap<T, ()>
).
"What's the point of that?" you ask. "I could just store the keys in a Vec
."
A HashSet
's unique feature is that it is guaranteed to not have duplicate elements. That's the contract that any set collection fulfills. HashSet
is just one implementation. (see also: BTreeSet
)
If you insert a value that is already present in the HashSet
, (i.e. the new value is equal to the existing and they both have the same hash), then the new value will replace the old.
This is great for when you never want more than one of something, or when you want to know if you've already got something.
But sets can do more than that.
Sets have 4 primary operations (all of the following calls return an iterator):
-
union
: get all the unique elements in both sets. -
difference
: get all the elements that are in the first set but not the second. -
intersection
: get all the elements that are only in both sets. -
symmetric_difference
: get all the elements that are in one set or the other, but not both.
Try all of these in the following example:
use std::collections::HashSet; fn main() { let mut a: HashSet<i32> = vec![1i32, 2, 3].into_iter().collect(); let mut b: HashSet<i32> = vec![2i32, 3, 4].into_iter().collect(); assert!(a.insert(4)); assert!(a.contains(&4)); // `HashSet::insert()` returns false if // there was a value already present. assert!(b.insert(4), "Value 4 is already in set B!"); // FIXME ^ Comment out this line b.insert(5); // If a collection's element type implements `Debug`, // then the collection implements `Debug`. // It usually prints its elements in the format `[elem1, elem2, ...]` println!("A: {:?}", a); println!("B: {:?}", b); // Print [1, 2, 3, 4, 5] in arbitrary order println!("Union: {:?}", a.union(&b).collect::<Vec<&i32>>()); // This should print [1] println!("Difference: {:?}", a.difference(&b).collect::<Vec<&i32>>()); // Print [2, 3, 4] in arbitrary order. println!("Intersection: {:?}", a.intersection(&b).collect::<Vec<&i32>>()); // Print [1, 5] println!("Symmetric Difference: {:?}", a.symmetric_difference(&b).collect::<Vec<&i32>>()); }
(Examples are adapted from the documentation.)
Rc
When multiple ownership is needed, Rc
(Reference Counting) can be used. Rc
keeps track of the number of the references which means the number of owners of the value wrapped inside an Rc
.
Reference count of an Rc
increases by 1 whenever an Rc
is cloned, and decreases by 1 whenever one cloned Rc
is dropped out of the scope. When an Rc
's reference count becomes zero (which means there are no remaining owners), both the Rc
and the value are all dropped.
Cloning an Rc
never performs a deep copy. Cloning creates just another pointer to the wrapped value, and increments the count:
use std::rc::Rc; fn main() { let rc_examples = "Rc examples".to_string(); { println!("--- rc_a is created ---"); let rc_a: Rc<String> = Rc::new(rc_examples); println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a)); { println!("--- rc_a is cloned to rc_b ---"); let rc_b: Rc<String> = Rc::clone(&rc_a); println!("Reference Count of rc_b: {}", Rc::strong_count(&rc_b)); println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a)); // Two `Rc`s are equal if their inner values are equal println!("rc_a and rc_b are equal: {}", rc_a.eq(&rc_b)); // We can use methods of a value directly println!("Length of the value inside rc_a: {}", rc_a.len()); println!("Value of rc_b: {}", rc_b); println!("--- rc_b is dropped out of scope ---"); } println!("Reference Count of rc_a: {}", Rc::strong_count(&rc_a)); println!("--- rc_a is dropped out of scope ---"); } // Error! `rc_examples` already moved into `rc_a` // And when `rc_a` is dropped, `rc_examples` is dropped together // println!("rc_examples: {}", rc_examples); // TODO ^ Try uncommenting this line }
See also:
std::rc and std::sync::arc.
Arc
When shared ownership between threads is needed, Arc
(Atomically Reference Counted) can be used. This struct, via the Clone
implementation can create a reference pointer for the location of a value in the memory heap while increasing the reference counter. As it shares ownership between threads, when the last reference pointer to a value is out of scope, the variable is dropped:
use std::time::Duration; use std::sync::Arc; use std::thread; fn main() { // This variable declaration is where its value is specified. let apple = Arc::new("the same apple"); for _ in 0..10 { // Here there is no value specification as it is a pointer to a // reference in the memory heap. let apple = Arc::clone(&apple); thread::spawn(move || { // As Arc was used, threads can be spawned using the value allocated // in the Arc variable pointer's location. println!("{:?}", apple); }); } // Make sure all Arc instances are printed from spawned threads. thread::sleep(Duration::from_secs(1)); }
Std misc
Many other types are provided by the std library to support things such as:
- Threads
- Channels
- File I/O
These expand beyond what the primitives provide.
See also:
primitives and the std library
Threads
Rust provides a mechanism for spawning native OS threads via the spawn
function, the argument of this function is a moving closure.
use std::thread; const NTHREADS: u32 = 10; // This is the `main` thread fn main() { // Make a vector to hold the children which are spawned. let mut children = vec![]; for i in 0..NTHREADS { // Spin up another thread children.push(thread::spawn(move || { println!("this is thread number {}", i); })); } for child in children { // Wait for the thread to finish. Returns a result. let _ = child.join(); } }
These threads will be scheduled by the OS.
Testcase: map-reduce
Rust makes it very easy to parallelise data processing, without many of the headaches traditionally associated with such an attempt.
The standard library provides great threading primitives out of the box. These, combined with Rust's concept of Ownership and aliasing rules, automatically prevent data races.
The aliasing rules (one writable reference XOR many readable references) automatically prevent you from manipulating state that is visible to other threads. (Where synchronisation is needed, there are synchronisation primitives like Mutex
es or Channel
s.)
In this example, we will calculate the sum of all digits in a block of numbers. We will do this by parcelling out chunks of the block into different threads. Each thread will sum its tiny block of digits, and subsequently we will sum the intermediate sums produced by each thread.
Note that, although we're passing references across thread boundaries, Rust understands that we're only passing read-only references, and that thus no unsafety or data races can occur. Also because the references we're passing have 'static
lifetimes, Rust understands that our data won't be destroyed while these threads are still running. (When you need to share non-static
data between threads, you can use a smart pointer like Arc
to keep the data alive and avoid non-static
lifetimes.)
use std::thread; // This is the `main` thread fn main() { // This is our data to process. // We will calculate the sum of all digits via a threaded map-reduce algorithm. // Each whitespace separated chunk will be handled in a different thread. // // TODO: see what happens to the output if you insert spaces! let data = "86967897737416471853297327050364959 11861322575564723963297542624962850 70856234701860851907960690014725639 38397966707106094172783238747669219 52380795257888236525459303330302837 58495327135744041048897885734297812 69920216438980873548808413720956532 16278424637452589860345374828574668"; // Make a vector to hold the child-threads which we will spawn. let mut children = vec![]; /************************************************************************* * "Map" phase * * Divide our data into segments, and apply initial processing ************************************************************************/ // split our data into segments for individual calculation // each chunk will be a reference (&str) into the actual data let chunked_data = data.split_whitespace(); // Iterate over the data segments. // .enumerate() adds the current loop index to whatever is iterated // the resulting tuple "(index, element)" is then immediately // "destructured" into two variables, "i" and "data_segment" with a // "destructuring assignment" for (i, data_segment) in chunked_data.enumerate() { println!("data segment {} is \"{}\"", i, data_segment); // Process each data segment in a separate thread // // spawn() returns a handle to the new thread, // which we MUST keep to access the returned value // // 'move || -> u32' is syntax for a closure that: // * takes no arguments ('||') // * takes ownership of its captured variables ('move') and // * returns an unsigned 32-bit integer ('-> u32') // // Rust is smart enough to infer the '-> u32' from // the closure itself so we could have left that out. // // TODO: try removing the 'move' and see what happens children.push(thread::spawn(move || -> u32 { // Calculate the intermediate sum of this segment: let result = data_segment // iterate over the characters of our segment.. .chars() // .. convert text-characters to their number value.. .map(|c| c.to_digit(10).expect("should be a digit")) // .. and sum the resulting iterator of numbers .sum(); // println! locks stdout, so no text-interleaving occurs println!("processed segment {}, result={}", i, result); // "return" not needed, because Rust is an "expression language", the // last evaluated expression in each block is automatically its value. result })); } /************************************************************************* * "Reduce" phase * * Collect our intermediate results, and combine them into a final result ************************************************************************/ // combine each thread's intermediate results into a single final sum. // // we use the "turbofish" ::<> to provide sum() with a type hint. // // TODO: try without the turbofish, by instead explicitly // specifying the type of final_result let final_result = children.into_iter().map(|c| c.join().unwrap()).sum::<u32>(); println!("Final sum result: {}", final_result); }
Assignments
It is not wise to let our number of threads depend on user inputted data. What if the user decides to insert a lot of spaces? Do we really want to spawn 2,000 threads? Modify the program so that the data is always chunked into a limited number of chunks, defined by a static constant at the beginning of the program.
See also:
- Threads
- vectors and iterators
- closures, move semantics and
move
closures - destructuring assignments
- turbofish notation to help type inference
- unwrap vs. expect
- enumerate
Channels
Rust provides asynchronous channels
for communication between threads. Channels allow a unidirectional flow of information between two end-points: the Sender
and the Receiver
.
use std::sync::mpsc::{Sender, Receiver}; use std::sync::mpsc; use std::thread; static NTHREADS: i32 = 3; fn main() { // Channels have two endpoints: the `Sender<T>` and the `Receiver<T>`, // where `T` is the type of the message to be transferred // (type annotation is superfluous) let (tx, rx): (Sender<i32>, Receiver<i32>) = mpsc::channel(); let mut children = Vec::new(); for id in 0..NTHREADS { // The sender endpoint can be copied let thread_tx = tx.clone(); // Each thread will send its id via the channel let child = thread::spawn(move || { // The thread takes ownership over `thread_tx` // Each thread queues a message in the channel thread_tx.send(id).unwrap(); // Sending is a non-blocking operation, the thread will continue // immediately after sending its message println!("thread {} finished", id); }); children.push(child); } // Here, all the messages are collected let mut ids = Vec::with_capacity(NTHREADS as usize); for _ in 0..NTHREADS { // The `recv` method picks a message from the channel // `recv` will block the current thread if there are no messages available ids.push(rx.recv()); } // Wait for the threads to complete any remaining work for child in children { child.join().expect("oops! the child thread panicked"); } // Show the order in which the messages were sent println!("{:?}", ids); }
Path
The Path
struct represents file paths in the underlying filesystem. There are two flavors of Path
: posix::Path
, for UNIX-like systems, and windows::Path
, for Windows. The prelude exports the appropriate platform-specific Path
variant.
A Path
can be created from an OsStr
, and provides several methods to get information from the file/directory the path points to.
A Path
is immutable. The owned version of Path
is PathBuf
. The relation between Path
and PathBuf
is similar to that of str
and String
: a PathBuf
can be mutated in-place, and can be dereferenced to a Path
.
Note that a Path
is not internally represented as an UTF-8 string, but instead is stored as an OsString
. Therefore, converting a Path
to a &str
is not free and may fail (an Option
is returned). However, a Path
can be freely converted to an OsString
or &OsStr
using into_os_string
and as_os_str
, respectively.
use std::path::Path; fn main() { // Create a `Path` from an `&'static str` let path = Path::new("."); // The `display` method returns a `Display`able structure let _display = path.display(); // `join` merges a path with a byte container using the OS specific // separator, and returns a `PathBuf` let mut new_path = path.join("a").join("b"); // `push` extends the `PathBuf` with a `&Path` new_path.push("c"); new_path.push("myfile.tar.gz"); // `set_file_name` updates the file name of the `PathBuf` new_path.set_file_name("package.tgz"); // Convert the `PathBuf` into a string slice match new_path.to_str() { None => panic!("new path is not a valid UTF-8 sequence"), Some(s) => println!("new path is {}", s), } }
Be sure to check at other Path
methods (posix::Path
or windows::Path
) and the Metadata
struct.
See also:
File I/O
The File
struct represents a file that has been opened (it wraps a file descriptor), and gives read and/or write access to the underlying file.
Since many things can go wrong when doing file I/O, all the File
methods return the io::Result<T>
type, which is an alias for Result<T, io::Error>
.
This makes the failure of all I/O operations explicit. Thanks to this, the programmer can see all the failure paths, and is encouraged to handle them in a proactive manner.
open
The open
function can be used to open a file in read-only mode.
A File
owns a resource, the file descriptor and takes care of closing the file when it is drop
ed.
use std::fs::File;
use std::io::prelude::*;
use std::path::Path;
fn main() {
// Create a path to the desired file
let path = Path::new("hello.txt");
let display = path.display();
// Open the path in read-only mode, returns `io::Result<File>`
let mut file = match File::open(&path) {
Err(why) => panic!("couldn't open {}: {}", display, why),
Ok(file) => file,
};
// Read the file contents into a string, returns `io::Result<usize>`
let mut s = String::new();
match file.read_to_string(&mut s) {
Err(why) => panic!("couldn't read {}: {}", display, why),
Ok(_) => print!("{} contains:\n{}", display, s),
}
// `file` goes out of scope, and the "hello.txt" file gets closed
}
Here's the expected successful output:
$ echo "Hello World!" > hello.txt
$ rustc open.rs && ./open
hello.txt contains:
Hello World!
(You are encouraged to test the previous example under different failure conditions: hello.txt
doesn't exist, or hello.txt
is not readable, etc.)
create
The create
function opens a file in write-only mode. If the file already existed, the old content is destroyed. Otherwise, a new file is created.
static LOREM_IPSUM: &str =
"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
";
use std::fs::File;
use std::io::prelude::*;
use std::path::Path;
fn main() {
let path = Path::new("lorem_ipsum.txt");
let display = path.display();
// Open a file in write-only mode, returns `io::Result<File>`
let mut file = match File::create(&path) {
Err(why) => panic!("couldn't create {}: {}", display, why),
Ok(file) => file,
};
// Write the `LOREM_IPSUM` string to `file`, returns `io::Result<()>`
match file.write_all(LOREM_IPSUM.as_bytes()) {
Err(why) => panic!("couldn't write to {}: {}", display, why),
Ok(_) => println!("successfully wrote to {}", display),
}
}
Here's the expected successful output:
$ rustc create.rs && ./create
successfully wrote to lorem_ipsum.txt
$ cat lorem_ipsum.txt
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
(As in the previous example, you are encouraged to test this example under failure conditions.)
The OpenOptions
struct can be used to configure how a file is opened.
read_lines
A naive approach
This might be a reasonable first attempt for a beginner's first implementation for reading lines from a file.
#![allow(unused)] fn main() { use std::fs::read_to_string; fn read_lines(filename: &str) -> Vec<String> { let mut result = Vec::new(); for line in read_to_string(filename).unwrap().lines() { result.push(line.to_string()) } result } }
Since the method lines()
returns an iterator over the lines in the file, we can also perform a map inline and collect the results, yielding a more concise and fluent expression.
#![allow(unused)] fn main() { use std::fs::read_to_string; fn read_lines(filename: &str) -> Vec<String> { read_to_string(filename) .unwrap() // panic on possible file-reading errors .lines() // split the string into an iterator of string slices .map(String::from) // make each slice into a string .collect() // gather them together into a vector } }
Note that in both examples above, we must convert the &str
reference returned from lines()
to the owned type String
, using .to_string()
and String::from
respectively.
A more efficient approach
Here we pass ownership of the open File
to a BufReader
struct. BufReader
uses an internal buffer to reduce intermediate allocations.
We also update read_lines
to return an iterator instead of allocating new String
objects in memory for each line.
use std::fs::File; use std::io::{self, BufRead}; use std::path::Path; fn main() { // File hosts.txt must exist in the current path if let Ok(lines) = read_lines("./hosts.txt") { // Consumes the iterator, returns an (Optional) String for line in lines { if let Ok(ip) = line { println!("{}", ip); } } } } // The output is wrapped in a Result to allow matching on errors // Returns an Iterator to the Reader of the lines of the file. fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>> where P: AsRef<Path>, { let file = File::open(filename)?; Ok(io::BufReader::new(file).lines()) }
Running this program simply prints the lines individually.
$ echo -e "127.0.0.1\n192.168.0.1\n" > hosts.txt
$ rustc read_lines.rs && ./read_lines
127.0.0.1
192.168.0.1
(Note that since File::open
expects a generic AsRef<Path>
as argument, we define our generic read_lines()
method with the same generic constraint, using the where
keyword.)
This process is more efficient than creating a String
in memory with all of the file's contents. This can especially cause performance issues when working with larger files.
Child processes
The process::Output
struct represents the output of a finished child process, and the process::Command
struct is a process builder.
use std::process::Command;
fn main() {
let output = Command::new("rustc")
.arg("--version")
.output().unwrap_or_else(|e| {
panic!("failed to execute process: {}", e)
});
if output.status.success() {
let s = String::from_utf8_lossy(&output.stdout);
print!("rustc succeeded and stdout was:\n{}", s);
} else {
let s = String::from_utf8_lossy(&output.stderr);
print!("rustc failed and stderr was:\n{}", s);
}
}
(You are encouraged to try the previous example with an incorrect flag passed to rustc
)
Pipes
The std::Child
struct represents a running child process, and exposes the stdin
, stdout
and stderr
handles for interaction with the underlying process via pipes.
use std::io::prelude::*;
use std::process::{Command, Stdio};
static PANGRAM: &'static str =
"the quick brown fox jumped over the lazy dog\n";
fn main() {
// Spawn the `wc` command
let process = match Command::new("wc")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn() {
Err(why) => panic!("couldn't spawn wc: {}", why),
Ok(process) => process,
};
// Write a string to the `stdin` of `wc`.
//
// `stdin` has type `Option<ChildStdin>`, but since we know this instance
// must have one, we can directly `unwrap` it.
match process.stdin.unwrap().write_all(PANGRAM.as_bytes()) {
Err(why) => panic!("couldn't write to wc stdin: {}", why),
Ok(_) => println!("sent pangram to wc"),
}
// Because `stdin` does not live after the above calls, it is `drop`ed,
// and the pipe is closed.
//
// This is very important, otherwise `wc` wouldn't start processing the
// input we just sent.
// The `stdout` field also has type `Option<ChildStdout>` so must be unwrapped.
let mut s = String::new();
match process.stdout.unwrap().read_to_string(&mut s) {
Err(why) => panic!("couldn't read wc stdout: {}", why),
Ok(_) => print!("wc responded with:\n{}", s),
}
}
Wait
If you'd like to wait for a process::Child
to finish, you must call Child::wait
, which will return a process::ExitStatus
.
use std::process::Command;
fn main() {
let mut child = Command::new("sleep").arg("5").spawn().unwrap();
let _result = child.wait().unwrap();
println!("reached end of main");
}
$ rustc wait.rs && ./wait
# `wait` keeps running for 5 seconds until the `sleep 5` command finishes
reached end of main
Filesystem Operations
The std::fs
module contains several functions that deal with the filesystem.
use std::fs;
use std::fs::{File, OpenOptions};
use std::io;
use std::io::prelude::*;
use std::os::unix;
use std::path::Path;
// A simple implementation of `% cat path`
fn cat(path: &Path) -> io::Result<String> {
let mut f = File::open(path)?;
let mut s = String::new();
match f.read_to_string(&mut s) {
Ok(_) => Ok(s),
Err(e) => Err(e),
}
}
// A simple implementation of `% echo s > path`
fn echo(s: &str, path: &Path) -> io::Result<()> {
let mut f = File::create(path)?;
f.write_all(s.as_bytes())
}
// A simple implementation of `% touch path` (ignores existing files)
fn touch(path: &Path) -> io::Result<()> {
match OpenOptions::new().create(true).write(true).open(path) {
Ok(_) => Ok(()),
Err(e) => Err(e),
}
}
fn main() {
println!("`mkdir a`");
// Create a directory, returns `io::Result<()>`
match fs::create_dir("a") {
Err(why) => println!("! {:?}", why.kind()),
Ok(_) => {},
}
println!("`echo hello > a/b.txt`");
// The previous match can be simplified using the `unwrap_or_else` method
echo("hello", &Path::new("a/b.txt")).unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`mkdir -p a/c/d`");
// Recursively create a directory, returns `io::Result<()>`
fs::create_dir_all("a/c/d").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`touch a/c/e.txt`");
touch(&Path::new("a/c/e.txt")).unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`ln -s ../b.txt a/c/b.txt`");
// Create a symbolic link, returns `io::Result<()>`
if cfg!(target_family = "unix") {
unix::fs::symlink("../b.txt", "a/c/b.txt").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
}
println!("`cat a/c/b.txt`");
match cat(&Path::new("a/c/b.txt")) {
Err(why) => println!("! {:?}", why.kind()),
Ok(s) => println!("> {}", s),
}
println!("`ls a`");
// Read the contents of a directory, returns `io::Result<Vec<Path>>`
match fs::read_dir("a") {
Err(why) => println!("! {:?}", why.kind()),
Ok(paths) => for path in paths {
println!("> {:?}", path.unwrap().path());
},
}
println!("`rm a/c/e.txt`");
// Remove a file, returns `io::Result<()>`
fs::remove_file("a/c/e.txt").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
println!("`rmdir a/c/d`");
// Remove an empty directory, returns `io::Result<()>`
fs::remove_dir("a/c/d").unwrap_or_else(|why| {
println!("! {:?}", why.kind());
});
}
Here's the expected successful output:
$ rustc fs.rs && ./fs
`mkdir a`
`echo hello > a/b.txt`
`mkdir -p a/c/d`
`touch a/c/e.txt`
`ln -s ../b.txt a/c/b.txt`
`cat a/c/b.txt`
> hello
`ls a`
> "a/b.txt"
> "a/c"
`rm a/c/e.txt`
`rmdir a/c/d`
And the final state of the a
directory is:
$ tree a
a
|-- b.txt
`-- c
`-- b.txt -> ../b.txt
1 directory, 2 files
An alternative way to define the function cat
is with ?
notation:
fn cat(path: &Path) -> io::Result<String> {
let mut f = File::open(path)?;
let mut s = String::new();
f.read_to_string(&mut s)?;
Ok(s)
}
See also:
Program arguments
Standard Library
The command line arguments can be accessed using std::env::args
, which returns an iterator that yields a String
for each argument:
use std::env; fn main() { let args: Vec<String> = env::args().collect(); // The first argument is the path that was used to call the program. println!("My path is {}.", args[0]); // The rest of the arguments are the passed command line parameters. // Call the program like this: // $ ./args arg1 arg2 println!("I got {:?} arguments: {:?}.", args.len() - 1, &args[1..]); }
$ ./args 1 2 3
My path is ./args.
I got 3 arguments: ["1", "2", "3"].
Crates
Alternatively, there are numerous crates that can provide extra functionality when creating command-line applications. The Rust Cookbook exhibits best practices on how to use one of the more popular command line argument crates, clap
.
Argument parsing
Matching can be used to parse simple arguments:
use std::env; fn increase(number: i32) { println!("{}", number + 1); } fn decrease(number: i32) { println!("{}", number - 1); } fn help() { println!("usage: match_args <string> Check whether given string is the answer. match_args {{increase|decrease}} <integer> Increase or decrease given integer by one."); } fn main() { let args: Vec<String> = env::args().collect(); match args.len() { // no arguments passed 1 => { println!("My name is 'match_args'. Try passing some arguments!"); }, // one argument passed 2 => { match args[1].parse() { Ok(42) => println!("This is the answer!"), _ => println!("This is not the answer."), } }, // one command and one argument passed 3 => { let cmd = &args[1]; let num = &args[2]; // parse the number let number: i32 = match num.parse() { Ok(n) => { n }, Err(_) => { eprintln!("error: second argument not an integer"); help(); return; }, }; // parse the command match &cmd[..] { "increase" => increase(number), "decrease" => decrease(number), _ => { eprintln!("error: invalid command"); help(); }, } }, // all the other cases _ => { // show a help message help(); } } }
$ ./match_args Rust
This is not the answer.
$ ./match_args 42
This is the answer!
$ ./match_args do something
error: second argument not an integer
usage:
match_args <string>
Check whether given string is the answer.
match_args {increase|decrease} <integer>
Increase or decrease given integer by one.
$ ./match_args do 42
error: invalid command
usage:
match_args <string>
Check whether given string is the answer.
match_args {increase|decrease} <integer>
Increase or decrease given integer by one.
$ ./match_args increase 42
43
Foreign Function Interface
Rust provides a Foreign Function Interface (FFI) to C libraries. Foreign functions must be declared inside an extern
block annotated with a #[link]
attribute containing the name of the foreign library.
use std::fmt;
// this extern block links to the libm library
#[link(name = "m")]
extern {
// this is a foreign function
// that computes the square root of a single precision complex number
fn csqrtf(z: Complex) -> Complex;
fn ccosf(z: Complex) -> Complex;
}
// Since calling foreign functions is considered unsafe,
// it's common to write safe wrappers around them.
fn cos(z: Complex) -> Complex {
unsafe { ccosf(z) }
}
fn main() {
// z = -1 + 0i
let z = Complex { re: -1., im: 0. };
// calling a foreign function is an unsafe operation
let z_sqrt = unsafe { csqrtf(z) };
println!("the square root of {:?} is {:?}", z, z_sqrt);
// calling safe API wrapped around unsafe operation
println!("cos({:?}) = {:?}", z, cos(z));
}
// Minimal implementation of single precision complex numbers
#[repr(C)]
#[derive(Clone, Copy)]
struct Complex {
re: f32,
im: f32,
}
impl fmt::Debug for Complex {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
if self.im < 0. {
write!(f, "{}-{}i", self.re, -self.im)
} else {
write!(f, "{}+{}i", self.re, self.im)
}
}
}
Testing
Rust is a programming language that cares a lot about correctness and it includes support for writing software tests within the language itself.
Testing comes in three styles:
- Unit testing.
- Doc testing.
- Integration testing.
Also Rust has support for specifying additional dependencies for tests:
See Also
- The Book chapter on testing
- API Guidelines on doc-testing
Unit testing
Tests are Rust functions that verify that the non-test code is functioning in the expected manner. The bodies of test functions typically perform some setup, run the code we want to test, then assert whether the results are what we expect.
Most unit tests go into a tests
mod with the #[cfg(test)]
attribute. Test functions are marked with the #[test]
attribute.
Tests fail when something in the test function panics. There are some helper macros:
assert!(expression)
- panics if expression evaluates tofalse
.assert_eq!(left, right)
andassert_ne!(left, right)
- testing left and right expressions for equality and inequality respectively.
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
// This is a really bad adding function, its purpose is to fail in this
// example.
#[allow(dead_code)]
fn bad_add(a: i32, b: i32) -> i32 {
a - b
}
#[cfg(test)]
mod tests {
// Note this useful idiom: importing names from outer (for mod tests) scope.
use super::*;
#[test]
fn test_add() {
assert_eq!(add(1, 2), 3);
}
#[test]
fn test_bad_add() {
// This assert would fire and test will fail.
// Please note, that private functions can be tested too!
assert_eq!(bad_add(1, 2), 3);
}
}
Tests can be run with cargo test
.
$ cargo test
running 2 tests
test tests::test_bad_add ... FAILED
test tests::test_add ... ok
failures:
---- tests::test_bad_add stdout ----
thread 'tests::test_bad_add' panicked at 'assertion failed: `(left == right)`
left: `-1`,
right: `3`', src/lib.rs:21:8
note: Run with `RUST_BACKTRACE=1` for a backtrace.
failures:
tests::test_bad_add
test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
Tests and ?
None of the previous unit test examples had a return type. But in Rust 2018, your unit tests can return Result<()>
, which lets you use ?
in them! This can make them much more concise.
fn sqrt(number: f64) -> Result<f64, String> { if number >= 0.0 { Ok(number.powf(0.5)) } else { Err("negative floats don't have square roots".to_owned()) } } #[cfg(test)] mod tests { use super::*; #[test] fn test_sqrt() -> Result<(), String> { let x = 4.0; assert_eq!(sqrt(x)?.powf(2.0), x); Ok(()) } }
See "The Edition Guide" for more details.
Testing panics
To check functions that should panic under certain circumstances, use attribute #[should_panic]
. This attribute accepts optional parameter expected =
with the text of the panic message. If your function can panic in multiple ways, it helps make sure your test is testing the correct panic.
pub fn divide_non_zero_result(a: u32, b: u32) -> u32 {
if b == 0 {
panic!("Divide-by-zero error");
} else if a < b {
panic!("Divide result is zero");
}
a / b
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_divide() {
assert_eq!(divide_non_zero_result(10, 2), 5);
}
#[test]
#[should_panic]
fn test_any_panic() {
divide_non_zero_result(1, 0);
}
#[test]
#[should_panic(expected = "Divide result is zero")]
fn test_specific_panic() {
divide_non_zero_result(1, 10);
}
}
Running these tests gives us:
$ cargo test
running 3 tests
test tests::test_any_panic ... ok
test tests::test_divide ... ok
test tests::test_specific_panic ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests tmp-test-should-panic
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running specific tests
To run specific tests one may specify the test name to cargo test
command.
$ cargo test test_any_panic
running 1 test
test tests::test_any_panic ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out
Doc-tests tmp-test-should-panic
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
To run multiple tests one may specify part of a test name that matches all the tests that should be run.
$ cargo test panic
running 2 tests
test tests::test_any_panic ... ok
test tests::test_specific_panic ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out
Doc-tests tmp-test-should-panic
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Ignoring tests
Tests can be marked with the #[ignore]
attribute to exclude some tests. Or to run them with command cargo test -- --ignored
#![allow(unused)] fn main() { pub fn add(a: i32, b: i32) -> i32 { a + b } #[cfg(test)] mod tests { use super::*; #[test] fn test_add() { assert_eq!(add(2, 2), 4); } #[test] fn test_add_hundred() { assert_eq!(add(100, 2), 102); assert_eq!(add(2, 100), 102); } #[test] #[ignore] fn ignored_test() { assert_eq!(add(0, 0), 0); } } }
$ cargo test
running 3 tests
test tests::ignored_test ... ignored
test tests::test_add ... ok
test tests::test_add_hundred ... ok
test result: ok. 2 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out
Doc-tests tmp-ignore
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
$ cargo test -- --ignored
running 1 test
test tests::ignored_test ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests tmp-ignore
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Documentation testing
The primary way of documenting a Rust project is through annotating the source code. Documentation comments are written in CommonMark Markdown specification and support code blocks in them. Rust takes care about correctness, so these code blocks are compiled and used as documentation tests.
/// First line is a short summary describing function.
///
/// The next lines present detailed documentation. Code blocks start with
/// triple backquotes and have implicit `fn main()` inside
/// and `extern crate <cratename>`. Assume we're testing `doccomments` crate:
///
/// ```
/// let result = doccomments::add(2, 3);
/// assert_eq!(result, 5);
/// ```
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
/// Usually doc comments may include sections "Examples", "Panics" and "Failures".
///
/// The next function divides two numbers.
///
/// # Examples
///
/// ```
/// let result = doccomments::div(10, 2);
/// assert_eq!(result, 5);
/// ```
///
/// # Panics
///
/// The function panics if the second argument is zero.
///
/// ```rust,should_panic
/// // panics on division by zero
/// doccomments::div(10, 0);
/// ```
pub fn div(a: i32, b: i32) -> i32 {
if b == 0 {
panic!("Divide-by-zero error");
}
a / b
}
Code blocks in documentation are automatically tested when running the regular cargo test
command:
$ cargo test
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests doccomments
running 3 tests
test src/lib.rs - add (line 7) ... ok
test src/lib.rs - div (line 21) ... ok
test src/lib.rs - div (line 31) ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Motivation behind documentation tests
The main purpose of documentation tests is to serve as examples that exercise the functionality, which is one of the most important guidelines. It allows using examples from docs as complete code snippets. But using ?
makes compilation fail since main
returns unit
. The ability to hide some source lines from documentation comes to the rescue: one may write fn try_main() -> Result<(), ErrorType>
, hide it and unwrap
it in hidden main
. Sounds complicated? Here's an example:
/// Using hidden `try_main` in doc tests.
///
/// ```
/// # // hidden lines start with `#` symbol, but they're still compilable!
/// # fn try_main() -> Result<(), String> { // line that wraps the body shown in doc
/// let res = doccomments::try_div(10, 2)?;
/// # Ok(()) // returning from try_main
/// # }
/// # fn main() { // starting main that'll unwrap()
/// # try_main().unwrap(); // calling try_main and unwrapping
/// # // so that test will panic in case of error
/// # }
/// ```
pub fn try_div(a: i32, b: i32) -> Result<i32, String> {
if b == 0 {
Err(String::from("Divide-by-zero"))
} else {
Ok(a / b)
}
}
See Also
- RFC505 on documentation style
- API Guidelines on documentation guidelines
Integration testing
Unit tests are testing one module in isolation at a time: they're small and can test private code. Integration tests are external to your crate and use only its public interface in the same way any other code would. Their purpose is to test that many parts of your library work correctly together.
Cargo looks for integration tests in tests
directory next to src
.
File src/lib.rs
:
// Define this in a crate called `adder`.
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
File with test: tests/integration_test.rs
:
#[test]
fn test_add() {
assert_eq!(adder::add(3, 2), 5);
}
Running tests with cargo test
command:
$ cargo test
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running target/debug/deps/integration_test-bcd60824f5fbfe19
running 1 test
test test_add ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests adder
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Each Rust source file in the tests
directory is compiled as a separate crate. In order to share some code between integration tests we can make a module with public functions, importing and using it within tests.
File tests/common/mod.rs
:
pub fn setup() {
// some setup code, like creating required files/directories, starting
// servers, etc.
}
File with test: tests/integration_test.rs
// importing common module.
mod common;
#[test]
fn test_add() {
// using common code.
common::setup();
assert_eq!(adder::add(3, 2), 5);
}
Creating the module as tests/common.rs
also works, but is not recommended because the test runner will treat the file as a test crate and try to run tests inside it.
Development dependencies
Sometimes there is a need to have dependencies for tests (or examples, or benchmarks) only. Such dependencies are added to Cargo.toml
in the [dev-dependencies]
section. These dependencies are not propagated to other packages which depend on this package.
One such example is pretty_assertions
, which extends standard assert_eq!
and assert_ne!
macros, to provide colorful diff.
File Cargo.toml
:
# standard crate data is left out
[dev-dependencies]
pretty_assertions = "1"
File src/lib.rs
:
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq; // crate for test-only use. Cannot be used in non-test code.
#[test]
fn test_add() {
assert_eq!(add(2, 3), 5);
}
}
See Also
Cargo docs on specifying dependencies.
Unsafe Operations
As an introduction to this section, to borrow from the official docs, "one should try to minimize the amount of unsafe code in a code base." With that in mind, let's get started! Unsafe annotations in Rust are used to bypass protections put in place by the compiler; specifically, there are four primary things that unsafe is used for:
- dereferencing raw pointers
- calling functions or methods which are
unsafe
(including calling a function over FFI, see a previous chapter of the book) - accessing or modifying static mutable variables
- implementing unsafe traits
Raw Pointers
Raw pointers *
and references &T
function similarly, but references are always safe because they are guaranteed to point to valid data due to the borrow checker. Dereferencing a raw pointer can only be done through an unsafe block.
fn main() { let raw_p: *const u32 = &10; unsafe { assert!(*raw_p == 10); } }
Calling Unsafe Functions
Some functions can be declared as unsafe
, meaning it is the programmer's responsibility to ensure correctness instead of the compiler's. One example of this is std::slice::from_raw_parts
which will create a slice given a pointer to the first element and a length:
use std::slice; fn main() { let some_vector = vec![1, 2, 3, 4]; let pointer = some_vector.as_ptr(); let length = some_vector.len(); unsafe { let my_slice: &[u32] = slice::from_raw_parts(pointer, length); assert_eq!(some_vector.as_slice(), my_slice); } }
For slice::from_raw_parts
, one of the assumptions which must be upheld is that the pointer passed in points to valid memory and that the memory pointed to is of the correct type. If these invariants aren't upheld then the program's behaviour is undefined and there is no knowing what will happen.
Inline assembly
Rust provides support for inline assembly via the asm!
macro. It can be used to embed handwritten assembly in the assembly output generated by the compiler. Generally this should not be necessary, but might be where the required performance or timing cannot be otherwise achieved. Accessing low level hardware primitives, e.g. in kernel code, may also demand this functionality.
Note: the examples here are given in x86/x86-64 assembly, but other architectures are also supported.
Inline assembly is currently supported on the following architectures:
- x86 and x86-64
- ARM
- AArch64
- RISC-V
Basic usage
Let us start with the simplest possible example:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; unsafe { asm!("nop"); } } }
This will insert a NOP (no operation) instruction into the assembly generated by the compiler. Note that all asm!
invocations have to be inside an unsafe
block, as they could insert arbitrary instructions and break various invariants. The instructions to be inserted are listed in the first argument of the asm!
macro as a string literal.
Inputs and outputs
Now inserting an instruction that does nothing is rather boring. Let us do something that actually acts on data:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let x: u64; unsafe { asm!("mov {}, 5", out(reg) x); } assert_eq!(x, 5); } }
This will write the value 5
into the u64
variable x
. You can see that the string literal we use to specify instructions is actually a template string. It is governed by the same rules as Rust format strings. The arguments that are inserted into the template however look a bit different than you may be familiar with. First we need to specify if the variable is an input or an output of the inline assembly. In this case it is an output. We declared this by writing out
. We also need to specify in what kind of register the assembly expects the variable. In this case we put it in an arbitrary general purpose register by specifying reg
. The compiler will choose an appropriate register to insert into the template and will read the variable from there after the inline assembly finishes executing.
Let us see another example that also uses an input:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let i: u64 = 3; let o: u64; unsafe { asm!( "mov {0}, {1}", "add {0}, 5", out(reg) o, in(reg) i, ); } assert_eq!(o, 8); } }
This will add 5
to the input in variable i
and write the result to variable o
. The particular way this assembly does this is first copying the value from i
to the output, and then adding 5
to it.
The example shows a few things:
First, we can see that asm!
allows multiple template string arguments; each one is treated as a separate line of assembly code, as if they were all joined together with newlines between them. This makes it easy to format assembly code.
Second, we can see that inputs are declared by writing in
instead of out
.
Third, we can see that we can specify an argument number, or name as in any format string. For inline assembly templates this is particularly useful as arguments are often used more than once. For more complex inline assembly using this facility is generally recommended, as it improves readability, and allows reordering instructions without changing the argument order.
We can further refine the above example to avoid the mov
instruction:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let mut x: u64 = 3; unsafe { asm!("add {0}, 5", inout(reg) x); } assert_eq!(x, 8); } }
We can see that inout
is used to specify an argument that is both input and output. This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.
It is also possible to specify different variables for the input and output parts of an inout
operand:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let x: u64 = 3; let y: u64; unsafe { asm!("add {0}, 5", inout(reg) x => y); } assert_eq!(y, 8); } }
Late output operands
The Rust compiler is conservative with its allocation of operands. It is assumed that an out
can be written at any time, and can therefore not share its location with any other argument. However, to guarantee optimal performance it is important to use as few registers as possible, so they won't have to be saved and reloaded around the inline assembly block. To achieve this Rust provides a lateout
specifier. This can be used on any output that is written only after all inputs have been consumed. There is also a inlateout
variant of this specifier.
Here is an example where inlateout
cannot be used in release
mode or other optimized cases:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let mut a: u64 = 4; let b: u64 = 4; let c: u64 = 4; unsafe { asm!( "add {0}, {1}", "add {0}, {2}", inout(reg) a, in(reg) b, in(reg) c, ); } assert_eq!(a, 12); } }
The above could work well in unoptimized cases (Debug
mode), but if you want optimized performance (release
mode or other optimized cases), it could not work.
That is because in optimized cases, the compiler is free to allocate the same register for inputs b
and c
since it knows they have the same value. However it must allocate a separate register for a
since it uses inout
and not inlateout
. If inlateout
was used, then a
and c
could be allocated to the same register, in which case the first instruction to overwrite the value of c
and cause the assembly code to produce the wrong result.
However the following example can use inlateout
since the output is only modified after all input registers have been read:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let mut a: u64 = 4; let b: u64 = 4; unsafe { asm!("add {0}, {1}", inlateout(reg) a, in(reg) b); } assert_eq!(a, 8); } }
As you can see, this assembly fragment will still work correctly if a
and b
are assigned to the same register.
Explicit register operands
Some instructions require that the operands be in a specific register. Therefore, Rust inline assembly provides some more specific constraint specifiers. While reg
is generally available on any architecture, explicit registers are highly architecture specific. E.g. for x86 the general purpose registers eax
, ebx
, ecx
, edx
, ebp
, esi
, and edi
among others can be addressed by their name.
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let cmd = 0xd1; unsafe { asm!("out 0x64, eax", in("eax") cmd); } } }
In this example we call the out
instruction to output the content of the cmd
variable to port 0x64
. Since the out
instruction only accepts eax
(and its sub registers) as operand we had to use the eax
constraint specifier.
Note: unlike other operand types, explicit register operands cannot be used in the template string: you can't use
{}
and should write the register name directly instead. Also, they must appear at the end of the operand list after all other operand types.
Consider this example which uses the x86 mul
instruction:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; fn mul(a: u64, b: u64) -> u128 { let lo: u64; let hi: u64; unsafe { asm!( // The x86 mul instruction takes rax as an implicit input and writes // the 128-bit result of the multiplication to rax:rdx. "mul {}", in(reg) a, inlateout("rax") b => lo, lateout("rdx") hi ); } ((hi as u128) << 64) + lo as u128 } } }
This uses the mul
instruction to multiply two 64-bit inputs with a 128-bit result. The only explicit operand is a register, that we fill from the variable a
. The second operand is implicit, and must be the rax
register, which we fill from the variable b
. The lower 64 bits of the result are stored in rax
from which we fill the variable lo
. The higher 64 bits are stored in rdx
from which we fill the variable hi
.
Clobbered registers
In many cases inline assembly will modify state that is not needed as an output. Usually this is either because we have to use a scratch register in the assembly or because instructions modify state that we don't need to further examine. This state is generally referred to as being "clobbered". We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.
use std::arch::asm; #[cfg(target_arch = "x86_64")] fn main() { // three entries of four bytes each let mut name_buf = [0_u8; 12]; // String is stored as ascii in ebx, edx, ecx in order // Because ebx is reserved, the asm needs to preserve the value of it. // So we push and pop it around the main asm. // (in 64 bit mode for 64 bit processors, 32 bit processors would use ebx) unsafe { asm!( "push rbx", "cpuid", "mov [rdi], ebx", "mov [rdi + 4], edx", "mov [rdi + 8], ecx", "pop rbx", // We use a pointer to an array for storing the values to simplify // the Rust code at the cost of a couple more asm instructions // This is more explicit with how the asm works however, as opposed // to explicit register outputs such as `out("ecx") val` // The *pointer itself* is only an input even though it's written behind in("rdi") name_buf.as_mut_ptr(), // select cpuid 0, also specify eax as clobbered inout("eax") 0 => _, // cpuid clobbers these registers too out("ecx") _, out("edx") _, ); } let name = core::str::from_utf8(&name_buf).unwrap(); println!("CPU Manufacturer ID: {}", name); } #[cfg(not(target_arch = "x86_64"))] fn main() {}
In the example above we use the cpuid
instruction to read the CPU manufacturer ID. This instruction writes to eax
with the maximum supported cpuid
argument and ebx
, edx
, and ecx
with the CPU manufacturer ID as ASCII bytes in that order.
Even though eax
is never read we still need to tell the compiler that the register has been modified so that the compiler can save any values that were in these registers before the asm. This is done by declaring it as an output but with _
instead of a variable name, which indicates that the output value is to be discarded.
This code also works around the limitation that ebx
is a reserved register by LLVM. That means that LLVM assumes that it has full control over the register and it must be restored to its original state before exiting the asm block, so it cannot be used as an input or output except if the compiler uses it to fulfill a general register class (e.g. in(reg)
). This makes reg
operands dangerous when using reserved registers as we could unknowingly corrupt our input or output because they share the same register.
To work around this we use rdi
to store the pointer to the output array, save ebx
via push
, read from ebx
inside the asm block into the array and then restore ebx
to its original state via pop
. The push
and pop
use the full 64-bit rbx
version of the register to ensure that the entire register is saved. On 32 bit targets the code would instead use ebx
in the push
/pop
.
This can also be used with a general register class to obtain a scratch register for use inside the asm code:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; // Multiply x by 6 using shifts and adds let mut x: u64 = 4; unsafe { asm!( "mov {tmp}, {x}", "shl {tmp}, 1", "shl {x}, 2", "add {x}, {tmp}", x = inout(reg) x, tmp = out(reg) _, ); } assert_eq!(x, 4 * 6); } }
Symbol operands and ABI clobbers
By default, asm!
assumes that any register not specified as an output will have its contents preserved by the assembly code. The clobber_abi
argument to asm!
tells the compiler to automatically insert the necessary clobber operands according to the given calling convention ABI: any register which is not fully preserved in that ABI will be treated as clobbered. Multiple clobber_abi
arguments may be provided and all clobbers from all specified ABIs will be inserted.
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; extern "C" fn foo(arg: i32) -> i32 { println!("arg = {}", arg); arg * 2 } fn call_foo(arg: i32) -> i32 { unsafe { let result; asm!( "call {}", // Function pointer to call in(reg) foo, // 1st argument in rdi in("rdi") arg, // Return value in rax out("rax") result, // Mark all registers which are not preserved by the "C" calling // convention as clobbered. clobber_abi("C"), ); result } } } }
Register template modifiers
In some cases, fine control is needed over the way a register name is formatted when inserted into the template string. This is needed when an architecture's assembly language has several names for the same register, each typically being a "view" over a subset of the register (e.g. the low 32 bits of a 64-bit register).
By default the compiler will always choose the name that refers to the full register size (e.g. rax
on x86-64, eax
on x86, etc).
This default can be overridden by using modifiers on the template string operands, just like you would with format strings:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let mut x: u16 = 0xab; unsafe { asm!("mov {0:h}, {0:l}", inout(reg_abcd) x); } assert_eq!(x, 0xabab); } }
In this example, we use the reg_abcd
register class to restrict the register allocator to the 4 legacy x86 registers (ax
, bx
, cx
, dx
) of which the first two bytes can be addressed independently.
Let us assume that the register allocator has chosen to allocate x
in the ax
register.
The h
modifier will emit the register name for the high byte of that register and the l
modifier will emit the register name for the low byte. The asm code will therefore be expanded as mov ah, al
which copies the low byte of the value into the high byte.
If you use a smaller data type (e.g. u16
) with an operand and forget to use template modifiers, the compiler will emit a warning and suggest the correct modifier to use.
Memory address operands
Sometimes assembly instructions require operands passed via memory addresses/memory locations. You have to manually use the memory address syntax specified by the target architecture. For example, on x86/x86_64 using Intel assembly syntax, you should wrap inputs/outputs in []
to indicate they are memory operands:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; fn load_fpu_control_word(control: u16) { unsafe { asm!("fldcw [{}]", in(reg) &control, options(nostack)); } } } }
Labels
Any reuse of a named label, local or otherwise, can result in an assembler or linker error or may cause other strange behavior. Reuse of a named label can happen in a variety of ways including:
- explicitly: using a label more than once in one
asm!
block, or multiple times across blocks. - implicitly via inlining: the compiler is allowed to instantiate multiple copies of an
asm!
block, for example when the function containing it is inlined in multiple places. - implicitly via LTO: LTO can cause code from other crates to be placed in the same codegen unit, and so could bring in arbitrary labels.
As a consequence, you should only use GNU assembler numeric local labels inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.
Moreover, on x86 when using the default Intel syntax, due to an LLVM bug, you shouldn't use labels exclusively made of 0
and 1
digits, e.g. 0
, 11
or 101010
, as they may end up being interpreted as binary values. Using options(att_syntax)
will avoid any ambiguity, but that affects the syntax of the entire asm!
block. (See Options, below, for more on options
.)
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let mut a = 0; unsafe { asm!( "mov {0}, 10", "2:", "sub {0}, 1", "cmp {0}, 3", "jle 2f", "jmp 2b", "2:", "add {0}, 2", out(reg) a ); } assert_eq!(a, 5); } }
This will decrement the {0}
register value from 10 to 3, then add 2 and store it in a
.
This example shows a few things:
- First, that the same number can be used as a label multiple times in the same inline block.
- Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes “b” (“backward”) or ”f” (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.
Options
By default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc. However, in many cases it is desirable to give the compiler more information about what the assembly code is actually doing so that it can optimize better.
Let's take our previous example of an add
instruction:
#![allow(unused)] fn main() { #[cfg(target_arch = "x86_64")] { use std::arch::asm; let mut a: u64 = 4; let b: u64 = 4; unsafe { asm!( "add {0}, {1}", inlateout(reg) a, in(reg) b, options(pure, nomem, nostack), ); } assert_eq!(a, 8); } }
Options can be provided as an optional final argument to the asm!
macro. We specified three options here:
pure
means that the asm code has no observable side effects and that its output depends only on its inputs. This allows the compiler optimizer to call the inline asm fewer times or even eliminate it entirely.nomem
means that the asm code does not read or write to memory. By default the compiler will assume that inline assembly can read or write any memory address that is accessible to it (e.g. through a pointer passed as an operand, or a global).nostack
means that the asm code does not push any data onto the stack. This allows the compiler to use optimizations such as the stack red zone on x86-64 to avoid stack pointer adjustments.
These allow the compiler to better optimize code using asm!
, for example by eliminating pure asm!
blocks whose outputs are not needed.
See the reference for the full list of available options and their effects.
Compatibility
The Rust language is fastly evolving, and because of this certain compatibility issues can arise, despite efforts to ensure forwards-compatibility wherever possible.
Raw identifiers
Rust, like many programming languages, has the concept of "keywords". These identifiers mean something to the language, and so you cannot use them in places like variable names, function names, and other places. Raw identifiers let you use keywords where they would not normally be allowed. This is particularly useful when Rust introduces new keywords, and a library using an older edition of Rust has a variable or function with the same name as a keyword introduced in a newer edition.
For example, consider a crate foo
compiled with the 2015 edition of Rust that exports a function named try
. This keyword is reserved for a new feature in the 2018 edition, so without raw identifiers, we would have no way to name the function.
extern crate foo;
fn main() {
foo::try();
}
You'll get this error:
error: expected identifier, found keyword `try`
--> src/main.rs:4:4
|
4 | foo::try();
| ^^^ expected identifier, found keyword
You can write this with a raw identifier:
extern crate foo;
fn main() {
foo::r#try();
}
Meta
Some topics aren't exactly relevant to how you program but provide you tooling or infrastructure support which just makes things better for everyone. These topics include:
- Documentation: Generate library documentation for users via the included
rustdoc
. - Playground: Integrate the Rust Playground in your documentation.
Documentation
Use cargo doc
to build documentation in target/doc
.
Use cargo test
to run all tests (including documentation tests), and cargo test --doc
to only run documentation tests.
These commands will appropriately invoke rustdoc
(and rustc
) as required.
Doc comments
Doc comments are very useful for big projects that require documentation. When running rustdoc
, these are the comments that get compiled into documentation. They are denoted by a ///
, and support Markdown.
#![crate_name = "doc"]
/// A human being is represented here
pub struct Person {
/// A person must have a name, no matter how much Juliet may hate it
name: String,
}
impl Person {
/// Returns a person with the name given them
///
/// # Arguments
///
/// * `name` - A string slice that holds the name of the person
///
/// # Examples
///
/// ```
/// // You can have rust code between fences inside the comments
/// // If you pass --test to `rustdoc`, it will even test it for you!
/// use doc::Person;
/// let person = Person::new("name");
/// ```
pub fn new(name: &str) -> Person {
Person {
name: name.to_string(),
}
}
/// Gives a friendly hello!
///
/// Says "Hello, [name](Person::name)" to the `Person` it is called on.
pub fn hello(& self) {
println!("Hello, {}!", self.name);
}
}
fn main() {
let john = Person::new("John");
john.hello();
}
To run the tests, first build the code as a library, then tell rustdoc
where to find the library so it can link it into each doctest program:
$ rustc doc.rs --crate-type lib
$ rustdoc --test --extern doc="libdoc.rlib" doc.rs
Doc attributes
Below are a few examples of the most common #[doc]
attributes used with rustdoc
.
inline
Used to inline docs, instead of linking out to separate page.
#[doc(inline)]
pub use bar::Bar;
/// bar docs
mod bar {
/// the docs for Bar
pub struct Bar;
}
no_inline
Used to prevent linking out to separate page or anywhere.
// Example from libcore/prelude
#[doc(no_inline)]
pub use crate::mem::drop;
hidden
Using this tells rustdoc
not to include this in documentation:
// Example from the futures-rs library
#[doc(hidden)]
pub use self::async_await::*;
For documentation, rustdoc
is widely used by the community. It's what is used to generate the std library docs.
See also:
- The Rust Book: Making Useful Documentation Comments
- The rustdoc Book
- The Reference: Doc comments
- RFC 1574: API Documentation Conventions
- RFC 1946: Relative links to other items from doc comments (intra-rustdoc links)
- Is there any documentation style guide for comments? (reddit)
Playground
The Rust Playground is a way to experiment with Rust code through a web interface.
Using it with mdbook
In mdbook
, you can make code examples playable and editable.
fn main() { println!("Hello World!"); }
This allows the reader to both run your code sample, but also modify and tweak it. The key here is the adding the word editable
to your codefence block separated by a comma.
```rust,editable
//...place your code here
```
Additionally, you can add ignore
if you want mdbook
to skip your code when it builds and tests.
```rust,editable,ignore
//...place your code here
```
Using it with docs
You may have noticed in some of the official Rust docs a button that says "Run", which opens the code sample up in a new tab in Rust Playground. This feature is enabled if you use the #[doc] attribute called html_playground_url
.