// Copyright (C) 2020 Klarälvdalens Datakonsult AB, a KDAB Group company, info@kdab.com, author Marc Mutz <marc.mutz@kdab.com>
// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only

\brief The QUtf8StringView class provides a unified view on UTF-8 strings
with a read-only subset of the QString API.

\ingroup string-processing

A QUtf8StringView references a contiguous portion of a UTF-8
string it does not own. It acts as an interface type to all kinds
of UTF-8 string, without the need to construct a QString or
QByteArray first.

The UTF-8 string may be represented as an array (or an
array-compatible data structure such as std::basic_string)
of \c char8_t, \c char, \c{signed char} or \c{unsigned char}.

QUtf8StringView is designed as an interface type; its main
use-case is as a function parameter type. When QUtf8StringViews
are used as automatic variables or data members, care must be
taken to ensure that the referenced string data (for example,
owned by a std::u8string) outlives the QUtf8StringView on all code
paths, lest the string view ends up referencing deleted data.

When used as an interface type, QUtf8StringView allows a single
function to accept a wide variety of UTF-8 string data
sources. One function accepting QUtf8StringView thus replaces
several function overloads (taking e.g. QByteArray), while at the
same time enabling even more string data sources to be passed to
the function, such as \c{u8"Hello World"}, a \c char8_t (C++20) or
\c char (C++17) string literal. The \c char8_t incompatibility
between C++17 and C++20 goes away when using QUtf8StringView.

Like all views, QUtf8StringViews should be passed by value, not by
reference-to-const:

\snippet code/src_corelib_text_qutf8stringview.cpp 0

If you want to give your users maximum freedom in what strings
they can pass to your function, consider using QAnyStringView
instead.

QUtf8StringView can also be used as the return value of a
function. If you call a function returning QUtf8StringView, take
extra care to not keep the QUtf8StringView around longer than the
function promises to keep the referenced string data alive. If in
doubt, obtain a strong reference to the data by calling toString()
to convert the QUtf8StringView into a QString.
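
The same lifetime rule applies to any non-owning view. The following is a minimal sketch that uses std::string_view and std::string as stand-ins for QUtf8StringView and QString, so that it builds without Qt. The class Registry and the function rememberName() are hypothetical names chosen for illustration; they are not part of Qt.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Hypothetical provider: the view returned by currentName() is only
// promised to stay valid until the next call to setName().
class Registry
{
public:
    void setName(std::string name) { m_name = std::move(name); }
    std::string_view currentName() const { return m_name; }

private:
    std::string m_name;
};

// Taking an owning copy (the analogue of calling toString() on a
// QUtf8StringView) decouples the caller from the provider's lifetime promise.
inline std::string rememberName(const Registry &registry)
{
    return std::string(registry.currentName());
}
```

A caller that stores the result of rememberName() can keep using it after a later setName() call, whereas a stored view from currentName() could then refer to freed memory.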

QUtf8StringView is a \e{Literal Type}.

\section2 Compatible Character Types

QUtf8StringView accepts strings over a variety of character types:

\list
\li \c char (both signed and unsigned)
\li \c char8_t (C++20 only)
\endlist

\section2 Sizes and Sub-Strings

All sizes and positions in QUtf8StringView functions are in
UTF-8 code units, that is, bytes (a UTF-8 multibyte sequence counts
as two, three or four, depending on its length). QUtf8StringView
does not attempt to detect or prevent slicing right through
UTF-8 multibyte sequences. This is similar to the situation with
QStringView and surrogate pairs.
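
To make the counting rule concrete, here is a small sketch that uses std::string_view as a stand-in for QUtf8StringView, so that it builds without Qt. It assumes the input is valid UTF-8, and the helper names (isContinuationByte, countCodePoints, splitsSequence) are hypothetical, not Qt API.

```cpp
#include <cstddef>
#include <string_view>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
constexpr bool isContinuationByte(char c)
{
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Counts characters by skipping continuation bytes. Assumes valid UTF-8.
constexpr std::size_t countCodePoints(std::string_view utf8)
{
    std::size_t n = 0;
    for (char c : utf8) {
        if (!isContinuationByte(c))
            ++n;
    }
    return n;
}

// True if cutting the view at byte position pos would split a multibyte sequence.
constexpr bool splitsSequence(std::string_view utf8, std::size_t pos)
{
    return pos < utf8.size() && isContinuationByte(utf8[pos]);
}

// "héllo" encoded as UTF-8: five characters stored in six bytes.
constexpr std::string_view sample = "h\xC3\xA9llo";
```

With this data, sample.size() counts the six bytes while countCodePoints(sample) counts five characters, and taking the first two bytes of the view would cut the two-byte sequence for "é" in half.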

\section2 C++20, char8_t, and QUtf8StringView

In C++20, \c{u8""} string literals changed their type from
\c{const char[]} to \c{const char8_t[]}. If Qt 6 could have depended
on C++20, QUtf8StringView would store \c char8_t natively, and the
following functions and aliases would use (pointers to) \c char8_t:

\list
\li storage_type, value_type, etc
\li begin(), end(), data(), etc
\li front(), back(), at(), operator[]()
\endlist

This is what QUtf8StringView is expected to look like in Qt 7, but for
Qt 6, this was not possible. Instead of locking users into a C++17-era
interface for the next decade, Qt provides two QUtf8StringView classes,
in different (inline) namespaces. The first, in namespace \c{q_no_char8_t},
has a value_type of \c{const char} and is universally available.
The second, in namespace \c{q_has_char8_t}, has a value_type of
\c{const char8_t} and is only available when compiling in C++20 mode.

\c{q_no_char8_t} is an inline namespace regardless of C++ edition, to avoid
accidental binary incompatibilities. To use the \c{char8_t} version, you
need to name it explicitly with \c{q_has_char8_t::QUtf8StringView}.
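
The name lookup described here can be sketched in plain C++ without Qt. The enclosing namespace \c{sketch} and the struct \c{View} are hypothetical stand-ins; only the two namespace names are taken from the text above, and this is not how Qt's headers are actually laid out.

```cpp
#include <type_traits>

namespace sketch {

// Always inline: the unqualified name View resolves to this version.
inline namespace q_no_char8_t {
struct View
{
    using value_type = const char;
};
} // namespace q_no_char8_t

#ifdef __cpp_char8_t
// Only present when the compiler supports char8_t; never inline,
// so it has to be named explicitly.
namespace q_has_char8_t {
struct View
{
    using value_type = const char8_t;
};
} // namespace q_has_char8_t
#endif

} // namespace sketch

// sketch::View is the char-based version, in C++17 and C++20 alike.
static_assert(std::is_same_v<sketch::View::value_type, const char>);
```

Because the choice of inline namespace does not depend on the language mode, the type that sketch::View names is the same in every translation unit, which is the property the text refers to as avoiding accidental binary incompatibilities.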

Internally, both are instantiations of the same template class,
QBasicUtf8StringView. Please do not use the template class's name in your
source code.

\sa QAnyStringView, QUtf8StringView, QString

\typedef QUtf8StringView::storage_type

\typedef QUtf8StringView::value_type

Alias for \c{const char}. Provided for compatibility with the STL.

\typedef QUtf8StringView::difference_type

Alias for \c{std::ptrdiff_t}. Provided for compatibility with the STL.

\typedef QUtf8StringView::size_type

Alias for qsizetype. Provided for compatibility with the STL.

\typedef QUtf8StringView::reference

Alias for \c{value_type &}. Provided for compatibility with the STL.

QUtf8StringView does not support mutable references, so this is the same
as const_reference.

\typedef QUtf8StringView::const_reference

Alias for \c{value_type &}. Provided for compatibility with the STL.

\typedef QUtf8StringView::pointer

Alias for \c{value_type *}. Provided for compatibility with the STL.

QUtf8StringView does not support mutable pointers, so this is the same
as const_pointer.

\typedef QUtf8StringView::const_pointer

Alias for \c{value_type *}. Provided for compatibility with the STL.

\typedef QUtf8StringView::iterator

This typedef provides an STL-style const iterator for QUtf8StringView.

QUtf8StringView does not support mutable iterators, so this is the same
as const_iterator.

\sa const_iterator, reverse_iterator

\typedef QUtf8StringView::const_iterator

This typedef provides an STL-style const iterator for QUtf8StringView.

\sa iterator, const_reverse_iterator

\typedef QUtf8StringView::reverse_iterator

This typedef provides an STL-style const reverse iterator for QUtf8StringView.

QUtf8StringView does not support mutable reverse iterators, so this is the
same as const_reverse_iterator.

\sa const_reverse_iterator, iterator

\typedef QUtf8StringView::const_reverse_iterator

This typedef provides an STL-style const reverse iterator for QUtf8StringView.

\sa reverse_iterator, const_iterator

\fn QUtf8StringView::QUtf8StringView()

Constructs a null string view.

\fn QUtf8StringView::QUtf8StringView(const storage_type *d, qsizetype n)

\fn QUtf8StringView::QUtf8StringView(std::nullptr_t)

Constructs a null string view.

\fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *str, qsizetype len)

Constructs a string view on \a str with length \a len.

The range \c{[str, str + len)} must remain valid for the lifetime of this string view object.

Passing \nullptr as \a str is safe if \a len is 0, too, and results in a null string view.

The behavior is undefined if \a len is negative or, when positive, if \a str is \nullptr.

This constructor only participates in overload resolution if \c Char is a compatible
character type. The compatible character types are: \c char8_t, \c char, \c{signed char} and
\c{unsigned char}.

\fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *first, const Char *last)

Constructs a string view on \a first with length (\a last - \a first).

The range \c{[first,last)} must remain valid for the lifetime of
this string view object.

Passing \nullptr as \a first is safe if \a last is \nullptr, too,
and results in a null string view.

The behavior is undefined if \a last precedes \a first, or \a first
is \nullptr and \a last is not.

This constructor only participates in overload resolution if \c Char is a compatible
character type. The compatible character types are: \c char8_t, \c char, \c{signed char} and
\c{unsigned char}.

\fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *str)

Constructs a string view on \a str. The length is determined
by scanning for the first \c{Char(0)}.

\a str must remain valid for the lifetime of this string view object.

Passing \nullptr as \a str is safe and results in a null string view.

This constructor only participates in overload resolution if \a str
is not an array and if \c Char is a compatible character type. The
compatible character types are: \c char8_t, \c char, \c{signed char} and
\c{unsigned char}.

\fn template <typename Char, size_t N> QUtf8StringView::QUtf8StringView(const Char (&string)[N])

Constructs a string view on the character string literal \a string.
The view covers the array until the first \c{Char(0)} is encountered,
or \c N, whichever comes first.
If you need the full array, use fromArray() instead.

\a string must remain valid for the lifetime of this string view
object.

This constructor only participates in overload resolution if \a string
is an actual array and if \c Char is a compatible character type. The
compatible character types are: \c char8_t, \c char, \c{signed char} and
\c{unsigned char}.

\fn template <typename Container, if_compatible_container<Container>> QUtf8StringView::QUtf8StringView(const Container &str)

Constructs a string view on \a str. The length is taken from \c{std::size(str)}.

\c{std::data(str)} must remain valid for the lifetime of this string view object.

This constructor only participates in overload resolution if \c Container is a
container with a compatible character type as \c{value_type}. The
compatible character types are: \c char8_t, \c char, \c{signed char} and
\c{unsigned char}.

The string view will be empty if and only if \c{std::size(str) == 0}. It is unspecified
whether this constructor can result in a null string view (\c{std::data(str)} would
have to return \nullptr for this).

\sa isNull(), isEmpty()

\fn template <typename Char, size_t Size, if_compatible_char<Char>> QUtf8StringView::fromArray(const Char (&string)[Size])

Constructs a string view on the full character string literal \a string,
including any trailing \c{Char(0)}. If you don't want the
null-terminator included in the view, you can chop() it off
when you are certain it is at the end. Alternatively, you can use
the constructor overload taking an array literal, which creates
a view up to, but not including, the first null-terminator in the data.

\a string must remain valid for the lifetime of this string view
object.

This function will work with any array literal if \c Char is a
compatible character type. The compatible character types
are: \c char8_t, \c char, \c{signed char} and \c{unsigned char}.
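
The difference between the array constructor and fromArray() can be sketched with std::string_view as a stand-in, so that the example builds without Qt. The helpers viewUntilNul() and viewFullArray() are hypothetical names that only mirror the documented behavior; they are not Qt functions.

```cpp
#include <cstddef>
#include <string_view>

// Mirrors the array constructor: stop at the first '\0', or at N.
template <std::size_t N>
constexpr std::string_view viewUntilNul(const char (&array)[N])
{
    std::size_t len = 0;
    while (len < N && array[len] != '\0')
        ++len;
    return std::string_view(array, len);
}

// Mirrors fromArray(): cover all N elements, including any '\0'.
template <std::size_t N>
constexpr std::string_view viewFullArray(const char (&array)[N])
{
    return std::string_view(array, N);
}

// Six elements: 'a', 'b', '\0', 'c', 'd' and the implicit trailing '\0'.
constexpr char embedded[] = "ab\0cd";
```

For this array, viewUntilNul(embedded) covers two elements and viewFullArray(embedded) covers six; dropping the last element of the full view (the analogue of chop(1)) leaves the five elements before the trailing terminator.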

\fn QString QUtf8StringView::toString() const

Returns a deep copy of this string view's data as a QString.

The return value will be a null QString if and only if this string view is null.

\fn QUtf8StringView::data() const

Returns a const pointer to the first code unit in the string view.

\note The character array represented by the return value is \e not null-terminated.

\sa begin(), end(), utf8()

\fn QUtf8StringView::utf8() const

Returns a const pointer to the first code unit in the string view.

The result is returned as a \c{const char8_t*}, so this function is only available when
compiling in C++20 mode.

\note The character array represented by the return value is \e not null-terminated.

\sa begin(), end(), data()

\fn QUtf8StringView::const_iterator QUtf8StringView::begin() const

Returns a const \l{STL-style iterators}{STL-style iterator} pointing to the first code unit in
the string view.

This function is provided for STL compatibility.

\sa end(), cbegin(), rbegin(), data()

\fn QUtf8StringView::const_iterator QUtf8StringView::cbegin() const

Same as begin().

This function is provided for STL compatibility.

\sa cend(), begin(), crbegin(), data()

\fn QUtf8StringView::const_iterator QUtf8StringView::end() const

Returns a const \l{STL-style iterators}{STL-style iterator} pointing to the imaginary
code unit after the last code unit in the string view.

This function is provided for STL compatibility.

\sa begin(), cend(), rend()

/*! \fn QUtf8StringView::const_iterator QUtf8StringView::cend() const

Same as end().

This function is provided for STL compatibility.

\sa cbegin(), end(), crend()
*/

\fn QUtf8StringView::const_reverse_iterator QUtf8StringView::rbegin() const

Returns a const \l{STL-style iterators}{STL-style} reverse iterator pointing to the first
code unit in the string view, in reverse order.

This function is provided for STL compatibility.

\sa rend(), crbegin(), begin()

\fn QUtf8StringView::const_reverse_iterator QUtf8StringView::crbegin() const

Same as rbegin().

This function is provided for STL compatibility.

\sa crend(), rbegin(), cbegin()

\fn QUtf8StringView::const_reverse_iterator QUtf8StringView::rend() const

Returns a const \l{STL-style iterators}{STL-style} reverse iterator pointing to one past
the last code unit in the string view, in reverse order.

This function is provided for STL compatibility.

\sa rbegin(), crend(), end()

\fn QUtf8StringView::const_reverse_iterator QUtf8StringView::crend() const

Same as rend().

This function is provided for STL compatibility.

\sa crbegin(), rend(), cend()

\fn bool QUtf8StringView::empty() const

Returns whether this string view is empty, that is, whether \c{size() == 0}.

This function is provided for STL compatibility.

\sa isEmpty(), isNull(), size(), length()

\fn bool QUtf8StringView::isEmpty() const

Returns whether this string view is empty, that is, whether \c{size() == 0}.

This function is provided for compatibility with other Qt containers.

\sa empty(), isNull(), size(), length()

\fn bool QUtf8StringView::isNull() const

Returns whether this string view is null, that is, whether \c{data() == nullptr}.

This function is provided for compatibility with other Qt containers.

\sa empty(), isEmpty(), size(), length()

\fn qsizetype QUtf8StringView::size() const

Returns the size of this string view, in UTF-8 code units (that is,
multi-byte sequences count as more than one for the purposes of this function, the same
as surrogate pairs in QString and QStringView).

\sa empty(), isEmpty(), isNull(), length()

\fn QUtf8StringView::length() const

Same as size().

This function is provided for compatibility with other Qt containers.

\sa empty(), isEmpty(), isNull(), size()

\fn QUtf8StringView::operator[](qsizetype n) const

Returns the code unit at position \a n in this string view.

The behavior is undefined if \a n is negative or not less than size().

\sa at(), front(), back()

\fn QUtf8StringView::at(qsizetype n) const

Returns the code unit at position \a n in this string view.

The behavior is undefined if \a n is negative or not less than size().

\sa operator[](), front(), back()

\fn QUtf8StringView::front() const

Returns the first code unit in the string view. Same as first().

This function is provided for STL compatibility.

\warning Calling this function on an empty string view constitutes
undefined behavior.

\fn QUtf8StringView::back() const

Returns the last code unit in the string view. Same as last().

This function is provided for STL compatibility.

\warning Calling this function on an empty string view constitutes
undefined behavior.

\fn QUtf8StringView::mid(qsizetype pos, qsizetype n) const

Returns the substring of length \a n starting at position
\a pos in this object.

\deprecated Use sliced() instead in new code.

Returns an empty string view if \a pos exceeds the
length of the string view. If there are fewer than \a n code units
available in the string view starting at \a pos, or if
\a n is negative (default), the function returns all code units that
are available from \a pos.

\sa first(), last(), sliced(), chopped(), chop(), truncate()
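
The clamping rules of mid() can be written out as a short sketch. It uses std::string_view as a stand-in so that it builds without Qt, and the function midSketch() is a hypothetical name; it follows the description above, not Qt's actual implementation. A negative \a pos is not covered by that description, so the sketch simply returns an empty view in that case, which is an assumption.

```cpp
#include <cstddef>
#include <string_view>

// Sketch of the documented clamping rules of mid().
constexpr std::string_view midSketch(std::string_view s, std::ptrdiff_t pos,
                                     std::ptrdiff_t n = -1)
{
    const auto size = static_cast<std::ptrdiff_t>(s.size());
    if (pos < 0 || pos > size)
        return {};          // start is out of range: empty view (pos < 0 is an assumption)
    if (n < 0 || n > size - pos)
        n = size - pos;     // clamp to what is available from pos
    return s.substr(static_cast<std::size_t>(pos), static_cast<std::size_t>(n));
}
```

In contrast, sliced() performs no such clamping and leaves out-of-range arguments as undefined behavior, which is why callers have to check their arguments themselves when porting from mid().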

\fn QUtf8StringView::left(qsizetype n) const

\deprecated Use first() instead in new code.

Returns the substring of length \a n starting at position
0 in this object.

The entire string view is returned if \a n is greater than or equal
to size(), or less than zero.

\sa first(), last(), sliced(), chopped(), chop(), truncate()

\fn QUtf8StringView::right(qsizetype n) const

\deprecated Use last() instead in new code.

Returns the substring of length \a n starting at position
size() - \a n in this object.

The entire string view is returned if \a n is greater than or equal
to size(), or less than zero.

\sa first(), last(), sliced(), chopped(), chop(), truncate()

\fn QUtf8StringView::first(qsizetype n) const

Returns a string view that contains the first \a n code units
of this string view.

\note The behavior is undefined when \a n < 0 or \a n > size().

\sa last(), sliced(), chopped(), chop(), truncate()

\fn QUtf8StringView::last(qsizetype n) const

Returns a string view that contains the last \a n code units of this string view.

\note The behavior is undefined when \a n < 0 or \a n > size().

\sa first(), sliced(), chopped(), chop(), truncate()

\fn QUtf8StringView::sliced(qsizetype pos, qsizetype n) const

Returns a string view containing \a n code units of this string view,
starting at position \a pos.

\note The behavior is undefined when \a pos < 0, \a n < 0,
or \a pos + \a n > size().

\sa first(), last(), chopped(), chop(), truncate()

\fn QUtf8StringView::sliced(qsizetype pos) const

Returns a string view starting at position \a pos in this object,
and extending to its end.

\note The behavior is undefined when \a pos < 0 or \a pos > size().

\sa first(), last(), chopped(), chop(), truncate()

\fn QUtf8StringView::chopped(qsizetype n) const

Returns the substring of length size() - \a n starting at the
beginning of this object.

Same as \c{first(size() - n)}.

\note The behavior is undefined when \a n < 0 or \a n > size().

\sa sliced(), first(), last(), chop(), truncate()

\fn QUtf8StringView::truncate(qsizetype n)

Truncates this string view to \a n code units.

Same as \c{*this = first(n)}.

\note The behavior is undefined when \a n < 0 or \a n > size().

\sa sliced(), first(), last(), chopped(), chop()

\fn QUtf8StringView::chop(qsizetype n)

Truncates this string view by \a n code units.

Same as \c{*this = first(size() - n)}.

\note The behavior is undefined when \a n < 0 or \a n > size().

\sa sliced(), first(), last(), chopped(), truncate()
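
The relationships between first(), last(), sliced() and chopped() can be checked with a small sketch. It uses std::string_view as a stand-in so that it builds without Qt, and all four function names are hypothetical. As with the Qt functions, the caller is assumed to pass arguments within range; the sketch does not add any checks of its own.

```cpp
#include <cstddef>
#include <string_view>

// The first n elements.
constexpr std::string_view firstSketch(std::string_view s, std::size_t n)
{
    return s.substr(0, n);
}

// The last n elements.
constexpr std::string_view lastSketch(std::string_view s, std::size_t n)
{
    return s.substr(s.size() - n);
}

// n elements starting at pos.
constexpr std::string_view slicedSketch(std::string_view s, std::size_t pos, std::size_t n)
{
    return s.substr(pos, n);
}

// Everything except the last n elements, expressed through firstSketch().
constexpr std::string_view choppedSketch(std::string_view s, std::size_t n)
{
    return firstSketch(s, s.size() - n);
}
```

truncate() and chop() follow the same pattern, except that they assign the result back to the view instead of returning a new one.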

\fn int QUtf8StringView::compare(QLatin1StringView str, Qt::CaseSensitivity cs) const
\fn int QUtf8StringView::compare(QUtf8StringView str, Qt::CaseSensitivity cs) const
\fn int QUtf8StringView::compare(QStringView str, Qt::CaseSensitivity cs) const

Returns an integer that compares to zero as this string view compares to the
string view \a str.

\include qstring.qdocinc {search-comparison-case-sensitivity} {comparison}

\fn QUtf8StringView::isValidUtf8() const

Returns \c true if this string view contains valid UTF-8 encoded data,
or \c false otherwise.

\fn template <typename QStringLike> qToUtf8StringViewIgnoringNull(const QStringLike &s);
\relates QUtf8StringView

Converts \a s to a QUtf8StringView, ignoring \c{s.isNull()}.

Returns a string view that references \a{s}'s data, but is never null.

This is a faster way to convert a QByteArray to a QUtf8StringView,
if null QByteArrays can legitimately be treated as empty ones.

\sa QByteArray::isNull(), QUtf8StringView