sort                  package:base                  R Documentation

_S_o_r_t_i_n_g _o_r _O_r_d_e_r_i_n_g _V_e_c_t_o_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     Sort (or _order_) a vector or factor (partially) into ascending
     (or descending) order.

_U_s_a_g_e:

     sort(x, partial = NULL, na.last = NA, decreasing = FALSE,
          method = c("shell", "quick"), index.return = FALSE)

     is.unsorted(x, na.rm = FALSE)

_A_r_g_u_m_e_n_t_s:

       x: a numeric, complex, character or logical vector, or a factor.

 partial: a vector of indices for partial sorting.

 na.last: for controlling the treatment of 'NA's. If 'TRUE', missing
          values in the data are put last; if 'FALSE', they are put
          first; if 'NA', they are removed.

decreasing: logical.  Should the sort be increasing or decreasing? Not
          available for partial sorting.

  method: character specifying the algorithm used.

index.return: logical indicating if the ordering index vector should be
          returned as well; this is only available for a few cases, the
          default 'na.last = NA' and full sorting of non-factors.

   na.rm: logical.  Should missing values be removed?

_D_e_t_a_i_l_s:

     If 'partial' is not 'NULL', it is taken to contain indices of
     elements of 'x' which are to be placed in their correct positions
     by partial sorting.  After the sort, the values specified in
     'partial' are in their correct position in the sorted array.  Any
     values smaller than these values are guaranteed to have a smaller
     index in the sorted array and any values which are greater are
     guaranteed to have a bigger index in the sorted array.  This is
     included for efficiency, and many of the options are not available
     for partial sorting.

     The sort order for character vectors will depend on the collating
     sequence of the locale in use: see 'Comparison'.

     'is.unsorted' returns a logical indicating if 'x' is sorted
     increasingly, i.e., 'is.unsorted(x)' is true if 'any(x !=
     sort(x))' (and there are no 'NA's).

     Method '"shell"' uses Shellsort (an O(n^{4/3}) variant from
     Sedgewick (1996)).  If 'x' has names a stable sort is used, so
     ties are not reordered.  (This only matters if names are present.)

     Method '"quick"' uses Singleton's Quicksort implementation and is
     only available when 'x' is numeric (double or integer) and
     'partial' is 'NULL'.  It is normally somewhat faster than
     Shellsort (perhaps twice as fast on vectors of length a million)
     but has poor performance in the rare worst case. (Peto's
     modification using a pseudo-random midpoint is used to make the
     worst case rarer.)  This is not a stable sort, and ties may be
     reordered.

_V_a_l_u_e:

     For 'sort' the sorted vector unless 'index.return' is true, when
     the result is a list with components named 'x' and 'ix' containing
     the sorted numbers and the ordering index vector.  In the latter
     case, if 'method == "quick"' ties may be reversed in the ordering,
     unlike 'sort.list', as quicksort is not stable.

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

     Sedgewick, R. (1986) A new upper bound for Shell sort. _J.
     Algorithms_ *7*, 159-173.

     Singleton, R. C. (1969) An efficient algorithm for sorting with
     minimal storage: Algorithm 347. _Communications of the ACM_ *12*,
     185-187.

_S_e_e _A_l_s_o:

     'order', 'rank'.

_E_x_a_m_p_l_e_s:

     require(stats)
     x <- swiss$Education[1:25]
     x; sort(x); sort(x, partial = c(10, 15))
     median # shows you another example for 'partial'

     ## illustrate 'stable' sorting (of ties):
     sort(c(10:3,2:12), method = "sh", index=TRUE) # is stable
     ## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
     ## $ix: 9  8 10  7 11  6 12  5 13  4 14  3 15  2 16  1 17 18 19
     sort(c(10:3,2:12), method = "qu", index=TRUE) # is not
     ## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
     ## $ix: 9 10  8  7 11  6 12  5 13  4 14  3 15 16  2 17  1 18 19
     ##        ^^^^^

     ## Not run: ## Small speed comparison simulation:
     N <- 2000
     Sim <- 20
     rep <- 50 # << adjust to your CPU
     c1 <- c2 <- numeric(Sim)
     for(is in 1:Sim){
       x <- rnorm(N)
       gc()  ## sort should not have to pay for gc
       c1[is] <- system.time(for(i in 1:rep) sort(x, method = "shell"))[1]
       c2[is] <- system.time(for(i in 1:rep) sort(x, method = "quick"))[1]
       stopifnot(sort(x, meth = "s") == sort(x, meth = "q"))
     }
     100 * rbind(ShellSort = c1, QuickSort = c2)
     cat("Speedup factor of quick sort():\n")
     summary({qq <- c1 / c2; qq[is.finite(qq)]})

     ## A larger test
     x <- rnorm(1e6)
     gc()
     system.time(x1 <- sort(x, method = "shell"))
     gc()
     system.time(x2 <- sort(x, method = "quick"))
     stopifnot(identical(x1, x2))
     ## End(Not run)

