qutf8stringview.qdoc
1// Copyright (C) 2020 Klarälvdalens Datakonsult AB, a KDAB Group company, info@kdab.com, author Marc Mutz <marc.mutz@kdab.com>
2// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only
3
4/*!
5 \class QUtf8StringView
6 \inmodule QtCore
7 \since 6.0
8 \brief The QUtf8StringView class provides a unified view on UTF-8 strings
9 with a read-only subset of the QString API.
10 \reentrant
11 \ingroup tools
12 \ingroup string-processing
13
14 A QUtf8StringView references a contiguous portion of a UTF-8
15 string it does not own. It acts as an interface type to all kinds
16 of UTF-8 strings, without the need to construct a QString or
17 QByteArray first.
18
19 The UTF-8 string may be represented as an array (or an
20 array-compatible data-structure such as std::basic_string, etc.)
21 of \c char8_t, \c char, \c{signed char} or \c{unsigned char}.
22
23 QUtf8StringView is designed as an interface type; its main
24 use-case is as a function parameter type. When QUtf8StringViews
25 are used as automatic variables or data members, care must be
26 taken to ensure that the referenced string data (for example,
27 owned by a std::u8string) outlives the QUtf8StringView on all code
28 paths; otherwise the string view ends up referencing deleted data.
29
30 When used as an interface type, QUtf8StringView allows a single
31 function to accept a wide variety of UTF-8 string data
32 sources. One function accepting QUtf8StringView thus replaces
33 several function overloads (taking e.g. QByteArray), while at the
34 same time enabling even more string data sources to be passed to
35 the function, such as \c{u8"Hello World"}, a \c char8_t (C++20) or
36 \c char (C++17) string literal. The \c char8_t incompatibility
37 between C++17 and C++20 goes away when using QUtf8StringView.
38
39 Like all views, QUtf8StringViews should be passed by value, not by
40 reference-to-const:
41 \snippet code/src_corelib_text_qutf8stringview.cpp 0
42
43 If you want to give your users maximum freedom in what strings
44 they can pass to your function, consider using QAnyStringView
45 instead.
46
47 QUtf8StringView can also be used as the return value of a
48 function. If you call a function returning QUtf8StringView, take
49 extra care to not keep the QUtf8StringView around longer than the
50 function promises to keep the referenced string data alive. If in
51 doubt, obtain a strong reference to the data by calling toString()
52 to convert the QUtf8StringView into a QString.
53
54 QUtf8StringView is a \e{Literal Type}.
55
56 \section2 Compatible Character Types
57
58 QUtf8StringView accepts strings over a variety of character types:
59
60 \list
61 \li \c char (both signed and unsigned)
62 \li \c char8_t (C++20 only)
63 \endlist
64
65 \section2 Sizes and Sub-Strings
66
67 All sizes and positions in QUtf8StringView functions are in UTF-8
68 code units, that is, bytes: a multibyte sequence counts as two,
69 three or four, depending on its length. The function descriptions
70 call these units "code points". QUtf8StringView does not attempt to
71 detect or prevent slicing right through UTF-8 multibyte sequences.
72 This is similar to the situation with QStringView and surrogate pairs.
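
 The counting rule above can be tried out without Qt. The following is a
 minimal sketch that uses std::string_view as a stand-in for QUtf8StringView
 (an assumption made only so that the example compiles on its own; the byte
 arithmetic is the same). The names countCodePoints and sample are
 hypothetical and not part of any Qt API.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts Unicode code points by skipping UTF-8
// continuation bytes (bit pattern 10xxxxxx). Assumes well-formed input.
constexpr std::size_t countCodePoints(std::string_view utf8)
{
    std::size_t n = 0;
    for (char ch : utf8) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

// 'n' followed by U+00E9, which UTF-8 encodes as the two bytes 0xC3 0xA9.
inline constexpr std::string_view sample{"n\xC3\xA9", 3};

// The view's size counts bytes, not characters.
static_assert(sample.size() == 3);
static_assert(countCodePoints(sample) == 2);

// Taking the first two bytes cuts the two-byte sequence in half:
// the result ends in a lead byte with no continuation byte after it.
static_assert(static_cast<unsigned char>(sample.substr(0, 2).back()) == 0xC3);
```

 A caller that needs to cut at a character boundary therefore has to find
 that boundary itself, for example by stepping back over continuation bytes.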
73
74 \section2 C++20, char8_t, and QUtf8StringView
75
76 In C++20, \c{u8""} string literals changed their type from
77 \c{const char[]} to \c{const char8_t[]}. If Qt 6 could have depended
78 on C++20, QUtf8StringView would store \c char8_t natively, and the
79 following functions and aliases would use (pointers to) \c char8_t:
80
81 \list
82 \li storage_type, value_type, etc
83 \li begin(), end(), data(), etc
84 \li front(), back(), at(), operator[]()
85 \endlist
86
87 This is what QUtf8StringView is expected to look like in Qt 7, but for
88 Qt 6, this was not possible. Instead of locking users into a C++17-era
89 interface for the next decade, Qt provides two QUtf8StringView classes,
90 in different (inline) namespaces. The first, in namespace \c{q_no_char8_t},
91 has a value_type of \c{const char} and is universally available.
92 The second, in namespace \c{q_has_char8_t}, has a value_type of
93 \c{const char8_t} and is only available when compiling in C++20 mode.
94
95 \c{q_no_char8_t} is an inline namespace regardless of C++ edition, to avoid
96 accidental binary incompatibilities. To use the \c{char8_t} version, you
97 need to name it explicitly with \c{q_has_char8_t::QUtf8StringView}.
98
99 Internally, both are instantiations of the same template class,
100 QBasicUtf8StringView. Please do not use the template class's name in your
101 source code.
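
 The literal-type change that motivates this section can be observed without
 Qt. The following is a minimal sketch, assuming a compiler that defines the
 standard feature-test macro __cpp_char8_t whenever char8_t is enabled. The
 alias name U8LiteralChar is made up for this example.

```cpp
#include <type_traits>

// Element type of a u8"" literal under the current language mode.
using U8LiteralChar =
    std::remove_cv_t<std::remove_reference_t<decltype(u8"x"[0])>>;

#ifdef __cpp_char8_t
// C++20 mode: u8"" literals are arrays of const char8_t.
static_assert(std::is_same_v<U8LiteralChar, char8_t>);
#else
// C++17 mode: u8"" literals are arrays of const char.
static_assert(std::is_same_v<U8LiteralChar, char>);
#endif

// In both modes one element is a single byte-sized UTF-8 code unit.
static_assert(sizeof(U8LiteralChar) == 1);
```

 Code that spells out either element type compiles in only one of the two
 modes, which is the incompatibility a view type accepting both avoids.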
102
103 \sa QAnyStringView, QStringView, QString
104*/
105
106/*!
107 \typedef QUtf8StringView::storage_type
108
109 Alias for \c{char}.
110*/
111
112/*!
113 \typedef QUtf8StringView::value_type
114
115 Alias for \c{const char}. Provided for compatibility with the STL.
116*/
117
118/*!
119 \typedef QUtf8StringView::difference_type
120
121 Alias for \c{std::ptrdiff_t}. Provided for compatibility with the STL.
122*/
123
124/*!
125 \typedef QUtf8StringView::size_type
126
127 Alias for qsizetype. Provided for compatibility with the STL.
128*/
129
130/*!
131 \typedef QUtf8StringView::reference
132
133 Alias for \c{value_type &}. Provided for compatibility with the STL.
134
135 QUtf8StringView does not support mutable references, so this is the same
136 as const_reference.
137*/
138
139/*!
140 \typedef QUtf8StringView::const_reference
141
142 Alias for \c{value_type &}. Provided for compatibility with the STL.
143*/
144
145/*!
146 \typedef QUtf8StringView::pointer
147
148 Alias for \c{value_type *}. Provided for compatibility with the STL.
149
150 QUtf8StringView does not support mutable pointers, so this is the same
151 as const_pointer.
152*/
153
154/*!
155 \typedef QUtf8StringView::const_pointer
156
157 Alias for \c{value_type *}. Provided for compatibility with the STL.
158*/
159
160/*!
161 \typedef QUtf8StringView::iterator
162
163 This typedef provides an STL-style const iterator for QUtf8StringView.
164
165 QUtf8StringView does not support mutable iterators, so this is the same
166 as const_iterator.
167
168 \sa const_iterator, reverse_iterator
169*/
170
171/*!
172 \typedef QUtf8StringView::const_iterator
173
174 This typedef provides an STL-style const iterator for QUtf8StringView.
175
176 \sa iterator, const_reverse_iterator
177*/
178
179/*!
180 \typedef QUtf8StringView::reverse_iterator
181
182 This typedef provides an STL-style const reverse iterator for QUtf8StringView.
183
184 QUtf8StringView does not support mutable reverse iterators, so this is the
185 same as const_reverse_iterator.
186
187 \sa const_reverse_iterator, iterator
188*/
189
190/*!
191 \typedef QUtf8StringView::const_reverse_iterator
192
193 This typedef provides an STL-style const reverse iterator for QUtf8StringView.
194
195 \sa reverse_iterator, const_iterator
196*/
197
198/*!
199 \fn QUtf8StringView::QUtf8StringView()
200
201 Constructs a null string view.
202
203 \sa isNull()
204*/
205
206/*!
207 \fn QUtf8StringView::QUtf8StringView(const storage_type *d, qsizetype n)
208 \internal
209*/
210
211/*!
212 \fn QUtf8StringView::QUtf8StringView(std::nullptr_t)
213
214 Constructs a null string view.
215
216 \sa isNull()
217*/
218
219/*!
220 \fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *str, qsizetype len)
221
222 Constructs a string view on \a str with length \a len.
223
224 The range \c{[str, str + len)} must remain valid for the lifetime of this string view object.
225
226 Passing \nullptr as \a str is safe if \a len is 0, too, and results in a null string view.
227
228 The behavior is undefined if \a len is negative or, when positive, if \a str is \nullptr.
229
230 This constructor only participates in overload resolution if \c Char is a compatible
231 character type. The compatible character types are: \c char8_t, \c char, \c{signed char} and
232 \c{unsigned char}.
233*/
234
235/*!
236 \fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *first, const Char *last)
237
238 Constructs a string view on \a first with length (\a last - \a first).
239
240 The range \c{[first,last)} must remain valid for the lifetime of
241 this string view object.
242
243 Passing \nullptr as \a first is safe if \a last is \nullptr, too,
244 and results in a null string view.
245
246 The behavior is undefined if \a last precedes \a first, or \a first
247 is \nullptr and \a last is not.
248
249 This constructor only participates in overload resolution if \c Char is a compatible
250 character type. The compatible character types are: \c char8_t, \c char, \c{signed char} and
251 \c{unsigned char}.
252*/
253
254/*!
255 \fn template <typename Char> QUtf8StringView::QUtf8StringView(const Char *str)
256
257 Constructs a string view on \a str. The length is determined
258 by scanning for the first \c{Char(0)}.
259
260 \a str must remain valid for the lifetime of this string view object.
261
262 Passing \nullptr as \a str is safe and results in a null string view.
263
264 This constructor only participates in overload resolution if \a str
265 is not an array and if \c Char is a compatible character type. The
266 compatible character types are: \c char8_t, \c char, \c{signed char} and
267 \c{unsigned char}.
268*/
269
270/*!
271 \fn template <typename Char, size_t N> QUtf8StringView::QUtf8StringView(const Char (&string)[N])
272
273 Constructs a string view on the character string literal \a string.
274 The view covers the array until the first \c{Char(0)} is encountered,
275 or \c N, whichever comes first.
276 If you need the full array, use fromArray() instead.
277
278 \a string must remain valid for the lifetime of this string view
279 object.
280
281 This constructor only participates in overload resolution if \a string
282 is an actual array and if \c Char is a compatible character type. The
283 compatible character types are: \c char8_t, \c char, \c{signed char} and
284 \c{unsigned char}.
285
286 \sa fromArray()
287*/
288
289/*!
290 \fn template <typename Container, if_compatible_container<Container>> QUtf8StringView::QUtf8StringView(const Container &str)
291
292 Constructs a string view on \a str. The length is taken from \c{std::size(str)}.
293
294 \c{std::data(str)} must remain valid for the lifetime of this string view object.
295
296 This constructor only participates in overload resolution if \c Container is a
297 container with a compatible character type as \c{value_type}. The
298 compatible character types are: \c char8_t, \c char, \c{signed char} and
299 \c{unsigned char}.
300
301 The string view will be empty if and only if \c{std::size(str) == 0}. It is unspecified
302 whether this constructor can result in a null string view (\c{std::data(str)} would
303 have to return \nullptr for this).
304
305 \sa isNull(), isEmpty()
306*/
307
308/*!
309 \fn template <typename Char, size_t Size, if_compatible_char<Char>> QUtf8StringView::fromArray(const Char (&string)[Size])
310
311 Constructs a string view on the full character string literal \a string,
312 including any trailing \c{Char(0)}. If you don't want the
313 null-terminator included in the view then you can chop() it off
314 when you are certain it is at the end. Alternatively you can use
315 the constructor overload taking an array literal which will create
316 a view up to, but not including, the first null-terminator in the data.
317
318 \a string must remain valid for the lifetime of this string view
319 object.
320
321 This function will work with any array literal if \c Char is a
322 compatible character type. The compatible character types
323 are: \c char8_t, \c char, \c{signed char} and \c{unsigned char}.
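
 The difference between the two length rules (array constructor versus
 fromArray()) can be modelled in plain C++. This is a sketch of the rules as
 described in this file, not Qt's implementation; lengthUpToFirstNul,
 fullArrayLength and buffer are hypothetical names.

```cpp
#include <cstddef>

// Model of the array constructor's rule: stop at the first Char(0),
// or at N if the array contains none.
template <typename Char, std::size_t N>
constexpr std::size_t lengthUpToFirstNul(const Char (&array)[N])
{
    std::size_t i = 0;
    while (i < N && array[i] != Char(0))
        ++i;
    return i;
}

// Model of fromArray()'s rule: the whole array, terminator included.
template <typename Char, std::size_t N>
constexpr std::size_t fullArrayLength(const Char (&)[N])
{
    return N;
}

// Six elements with an embedded null: 'a', 'b', 0, 'c', 'd', 0.
inline constexpr char buffer[] = "ab\0cd";

static_assert(lengthUpToFirstNul(buffer) == 2); // stops at the embedded null
static_assert(fullArrayLength(buffer) == 6);    // keeps everything
```

 Under these rules, chopping one element off the fromArray() result removes
 only the trailing terminator and keeps the embedded null.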
324*/
325
326/*!
327 \fn QString QUtf8StringView::toString() const
328
329 Returns a deep copy of this string view's data as a QString.
330
331 The return value will be a null QString if and only if this string view is null.
332*/
333
334/*!
335 \fn QUtf8StringView::data() const
336
337 Returns a const pointer to the first code point in the string view.
338
339 \note The character array represented by the return value is \e not null-terminated.
340
341 \sa begin(), end(), utf8()
342*/
343
344/*!
345 \fn QUtf8StringView::utf8() const
346
347 Returns a const pointer to the first code point in the string view.
348
349 The result is returned as a \c{const char8_t*}, so this function is only available when
350 compiling in C++20 mode.
351
352 \note The character array represented by the return value is \e not null-terminated.
353
354 \sa begin(), end(), data()
355*/
356
357/*!
358 \fn QUtf8StringView::const_iterator QUtf8StringView::begin() const
359
360 Returns a const \l{STL-style iterators}{STL-style iterator} pointing to the first code point in
361 the string view.
362
363 This function is provided for STL compatibility.
364
365 \sa end(), cbegin(), rbegin(), data()
366*/
367
368/*!
369 \fn QUtf8StringView::const_iterator QUtf8StringView::cbegin() const
370
371 Same as begin().
372
373 This function is provided for STL compatibility.
374
375 \sa cend(), begin(), crbegin(), data()
376*/
377
378/*!
379 \fn QUtf8StringView::const_iterator QUtf8StringView::end() const
380
381 Returns a const \l{STL-style iterators}{STL-style iterator} pointing to the imaginary
382 code point after the last code point in the string view.
383
384 This function is provided for STL compatibility.
385
386 \sa begin(), cend(), rend()
387*/
388
389/*! \fn QUtf8StringView::const_iterator QUtf8StringView::cend() const
390
391 Same as end().
392
393 This function is provided for STL compatibility.
394
395 \sa cbegin(), end(), crend()
396*/
397
398/*!
399 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::rbegin() const
400
401 Returns a const \l{STL-style iterators}{STL-style} reverse iterator pointing to the first
402 code point in the string view, in reverse order.
403
404 This function is provided for STL compatibility.
405
406 \sa rend(), crbegin(), begin()
407*/
408
409/*!
410 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::crbegin() const
411
412 Same as rbegin().
413
414 This function is provided for STL compatibility.
415
416 \sa crend(), rbegin(), cbegin()
417*/
418
419/*!
420 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::rend() const
421
422 Returns a const \l{STL-style iterators}{STL-style} reverse iterator pointing to one past
423 the last code point in the string view, in reverse order.
424
425 This function is provided for STL compatibility.
426
427 \sa rbegin(), crend(), end()
428*/
429
430/*!
431 \fn QUtf8StringView::const_reverse_iterator QUtf8StringView::crend() const
432
433 Same as rend().
434
435 This function is provided for STL compatibility.
436
437 \sa crbegin(), rend(), cend()
438*/
439
440/*!
441 \fn bool QUtf8StringView::empty() const
442
443 Returns whether this string view is empty - that is, whether \c{size() == 0}.
444
445 This function is provided for STL compatibility.
446
447 \sa isEmpty(), isNull(), size(), length()
448*/
449
450/*!
451 \fn bool QUtf8StringView::isEmpty() const
452
453 Returns whether this string view is empty - that is, whether \c{size() == 0}.
454
455 This function is provided for compatibility with other Qt containers.
456
457 \sa empty(), isNull(), size(), length()
458*/
459
460/*!
461 \fn bool QUtf8StringView::isNull() const
462
463 Returns whether this string view is null - that is, whether \c{data() == nullptr}.
464
465 This function is provided for compatibility with other Qt containers.
466
467 \sa empty(), isEmpty(), size(), length()
468*/
469
470/*!
471 \fn qsizetype QUtf8StringView::size() const
472
473 Returns the size of this string view, in UTF-8 code units (that is,
474 multi-byte sequences count as more than one for the purposes of this function, the same
475 as surrogate pairs in QString and QStringView).
476
477 \sa empty(), isEmpty(), isNull(), length()
478*/
479
480/*!
481 \fn QUtf8StringView::length() const
482
483 Same as size().
484
485 This function is provided for compatibility with other Qt containers.
486
487 \sa empty(), isEmpty(), isNull(), size()
488*/
489
490/*!
491 \fn QUtf8StringView::operator[](qsizetype n) const
492
493 Returns the code point at position \a n in this string view.
494
495 The behavior is undefined if \a n is negative or not less than size().
496
497 \sa at(), front(), back()
498*/
499
500/*!
501 \fn QUtf8StringView::at(qsizetype n) const
502
503 Returns the code point at position \a n in this string view.
504
505 The behavior is undefined if \a n is negative or not less than size().
506
507 \sa operator[](), front(), back()
508*/
509
510/*!
511 \fn QUtf8StringView::front() const
512
513 Returns the first code point in the string view. Same as \c{at(0)}.
514
515 This function is provided for STL compatibility.
516
517 \warning Calling this function on an empty string view constitutes
518 undefined behavior.
519
520 \sa back()
521*/
522
523/*!
524 \fn QUtf8StringView::back() const
525
526 Returns the last code point in the string view. Same as \c{at(size() - 1)}.
527
528 This function is provided for STL compatibility.
529
530 \warning Calling this function on an empty string view constitutes
531 undefined behavior.
532
533 \sa front()
534*/
535
536/*!
537 \fn QUtf8StringView::mid(qsizetype pos, qsizetype n) const
538
539 Returns the substring of length \a n starting at position
540 \a pos in this object.
541
542 \deprecated Use sliced() instead in new code.
543
544 Returns an empty string view if \a pos exceeds the
545 length of the string view. If there are fewer than \a n code points
546 available in the string view starting at \a pos, or if
547 \a n is negative (default), the function returns all code points that
548 are available from \a pos.
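
 The clamping rules can be sketched in plain C++ with std::string_view as a
 stand-in. This is a model, not Qt's implementation. It assumes that the
 empty-view case is triggered by \a pos lying beyond the end (as documented
 for QString::mid()) and that \a pos is not negative. The name midModel is
 hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical model of the lenient rules described above; assumes pos >= 0.
constexpr std::string_view midModel(std::string_view s,
                                    std::ptrdiff_t pos,
                                    std::ptrdiff_t n = -1)
{
    const auto size = static_cast<std::ptrdiff_t>(s.size());
    if (pos > size)
        return {};                 // position past the end: empty view
    const std::ptrdiff_t available = size - pos;
    if (n < 0 || n > available)
        n = available;             // clamp to what is left from pos
    return s.substr(static_cast<std::size_t>(pos),
                    static_cast<std::size_t>(n));
}

static_assert(midModel("abcdef", 2, 3) == std::string_view("cde"));
static_assert(midModel("abcdef", 4, 10) == std::string_view("ef")); // clamped
static_assert(midModel("abcdef", 2) == std::string_view("cdef"));   // n < 0
static_assert(midModel("abcdef", 7, 1).empty());                    // pos too big
```

 sliced() performs none of this clamping: out-of-range arguments are
 undefined behavior there, so the caller has to validate them first.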
549
550 \sa first(), last(), sliced(), chopped(), chop(), truncate()
551*/
552
553/*!
554 \fn QUtf8StringView::left(qsizetype n) const
555
556 \deprecated Use first() instead in new code.
557
558 Returns the substring of length \a n starting at position
559 0 in this object.
560
561 The entire string view is returned if \a n is greater than or equal
562 to size(), or less than zero.
563
564 \sa first(), last(), sliced(), chopped(), chop(), truncate()
565*/
566
567/*!
568 \fn QUtf8StringView::right(qsizetype n) const
569
570 \deprecated Use last() instead in new code.
571
572 Returns the substring of length \a n starting at position
573 size() - \a n in this object.
574
575 The entire string view is returned if \a n is greater than or equal
576 to size(), or less than zero.
577
578 \sa first(), last(), sliced(), chopped(), chop(), truncate()
579*/
580
581/*!
582 \fn QUtf8StringView::first(qsizetype n) const
583
584 Returns a string view that contains the first \a n code points
585 of this string view.
586
587 \note The behavior is undefined when \a n < 0 or \a n > size().
588
589 \sa last(), sliced(), chopped(), chop(), truncate()
590*/
591
592/*!
593 \fn QUtf8StringView::last(qsizetype n) const
594
595 Returns a string view that contains the last \a n code points of this string view.
596
597 \note The behavior is undefined when \a n < 0 or \a n > size().
598
599 \sa first(), sliced(), chopped(), chop(), truncate()
600*/
601
602/*!
603 \fn QUtf8StringView::sliced(qsizetype pos, qsizetype n) const
604
605 Returns a string view containing \a n code points of this string view,
606 starting at position \a pos.
607
608 \note The behavior is undefined when \a pos < 0, \a n < 0,
609 or \a pos + \a n > size().
610
611 \sa first(), last(), chopped(), chop(), truncate()
612*/
613
614/*!
615 \fn QUtf8StringView::sliced(qsizetype pos) const
616
617 Returns a string view starting at position \a pos in this object,
618 and extending to its end.
619
620 \note The behavior is undefined when \a pos < 0 or \a pos > size().
621
622 \sa first(), last(), chopped(), chop(), truncate()
623*/
624
625/*!
626 \fn QUtf8StringView::chopped(qsizetype n) const
627
628 Returns the substring of length size() - \a n starting at the
629 beginning of this object.
630
631 Same as \c{first(size() - n)}.
632
633 \note The behavior is undefined when \a n < 0 or \a n > size().
634
635 \sa sliced(), first(), last(), chop(), truncate()
636*/
637
638/*!
639 \fn QUtf8StringView::truncate(qsizetype n)
640
641 Truncates this string view to \a n code points.
642
643 Same as \c{*this = first(n)}.
644
645 \note The behavior is undefined when \a n < 0 or \a n > size().
646
647 \sa sliced(), first(), last(), chopped(), chop()
648*/
649
650/*!
651 \fn QUtf8StringView::chop(qsizetype n)
652
653 Truncates this string view by \a n code points.
654
655 Same as \c{*this = first(size() - n)}.
656
657 \note The behavior is undefined when \a n < 0 or \a n > size().
658
659 \sa sliced(), first(), last(), chopped(), truncate()
660*/
661
662/*!
663 \fn int QUtf8StringView::compare(QLatin1StringView str, Qt::CaseSensitivity cs) const
664 \fn int QUtf8StringView::compare(QUtf8StringView str, Qt::CaseSensitivity cs) const
665 \fn int QUtf8StringView::compare(QStringView str, Qt::CaseSensitivity cs) const
666 \since 6.5
667
668 Returns an integer that compares to zero as this string view compares to the
669 string view \a str.
670
671 \include qstring.qdocinc {search-comparison-case-sensitivity} {comparison}
672*/
673
674
675/*!
676 \fn QUtf8StringView::isValidUtf8() const
677
678 Returns \c true if this string view contains valid UTF-8 encoded data,
679 or \c false otherwise.
680
681 \since 6.3
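
 To illustrate what "valid UTF-8" usually means, the following is a sketch of
 a well-formedness check along the lines of RFC 3629. It is not Qt's
 implementation, and whether isValidUtf8() applies exactly these criteria is
 an assumption. The name looksLikeValidUtf8 is hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Rejects stray continuation bytes, truncated sequences, overlong
// encodings, UTF-16 surrogates (U+D800..U+DFFF) and values above U+10FFFF.
constexpr bool looksLikeValidUtf8(std::string_view s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0; // continuation bytes expected after the lead
        char32_t cp = 0;       // code point being decoded
        char32_t min = 0;      // smallest value this sequence length may encode
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1Fu; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0Fu; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07u; min = 0x10000;
        } else {
            return false;      // continuation byte or invalid lead byte
        }
        if (s.size() - i - 1 < extra)
            return false;      // sequence is cut off at the end of the view
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;  // expected a continuation byte
            cp = (cp << 6) | (c & 0x3Fu);
        }
        if (cp < min || cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu))
            return false;      // overlong, out of range, or surrogate
        i += extra + 1;
    }
    return true;
}

static_assert(looksLikeValidUtf8("n\xC3\xA9")); // 'n' plus U+00E9
static_assert(!looksLikeValidUtf8("\xC3"));     // truncated sequence
static_assert(!looksLikeValidUtf8("\xC0\x80")); // overlong encoding of U+0000
```

 A check like this matters most after slicing, because a view cut in the
 middle of a multibyte sequence no longer passes it.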
682*/
683
684/*!
685 \fn template <typename QStringLike> qToUtf8StringViewIgnoringNull(const QStringLike &s);
686 \relates QUtf8StringView
687 \internal
688
689 Converts \a s to a QUtf8StringView ignoring \c{s.isNull()}.
690
691 Returns a string view that references \a{s}'s data, but is never null.
692
693 This is a faster way to convert a QByteArray to a QUtf8StringView,
694 if null QByteArrays can legitimately be treated as empty ones.
695
696 \sa QByteArray::isNull(), QUtf8StringView
697*/